Re: [PR49888, VTA] don't keep VALUEs bound to modified MEMs
On Jun 27, 2012, Richard Henderson r...@redhat.com wrote:

> On 06/26/2012 01:54 PM, Alexandre Oliva wrote:
>> +  track_stack_pointer (dst, src1, src2);

> Why does this function return a value then?

During testing, I used an assert on the return value to catch cases
that couldn't be handled. The comments before that function say:

+   ??? The return value, that was useful during testing, ended up
+   unused, but this single-use static function will be inlined, and
+   then the return value computation will be optimized out, so I'm
+   leaving it in.

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer
Re: [testsuite] don't use lto plugin if it doesn't work
On Jun 27, 2012, Mike Stump mikest...@comcast.net wrote:

> On Jun 27, 2012, at 2:07 AM, Alexandre Oliva wrote:
>> Why? We don't demand a working plugin. Indeed, we disable the use
>> of the plugin if we find a linker that doesn't support it. We just
>> don't account for the possibility of finding a linker that supports
>> plugins, but that doesn't support the one we'll build later.

> If this is the preferred solution, then having configure check the
> 64-bitness of ld and turning off the plugin altogether on mismatches
> sounds like a reasonable course of action to me.

I'd be very surprised if I asked for an i686 native build to package
and install elsewhere, and didn't get a plugin just because the
build-time linker wouldn't have been able to run the plugin.

-- 
Alexandre Oliva
Re: [PATCH] Add MULT_HIGHPART_EXPR
On Wed, Jun 27, 2012 at 02:37:08PM -0700, Richard Henderson wrote: I was sitting on this patch until I got around to fixing up Jakub's existing vector divmod code to use it. But seeing as how he's adding more uses, I think it's better to get it in earlier. Tested via a patch sent under separate cover that changes __builtin_alpha_umulh to immediately fold to MULT_HIGHPART_EXPR. Thanks. Here is an incremental patch on top of my patch from yesterday which expands some of the vector divisions/modulos using MULT_HIGHPART_EXPR instead of VEC_WIDEN_MULT_*_EXPR + VEC_PERM_EXPR if backend supports that. Improves code generated for ushort or short / or % on i?86 (slightly complicated by the fact that unfortunately even -mavx2 doesn't support vector by vector shifts for V{8,16}HImode (nor V{16,32}QImode), XOP does though). Ok for trunk? I'll look at using MULT_HIGHPART_EXPR in the pattern recognizer and vectorizing it as either of the sequences next. 2012-06-28 Jakub Jelinek ja...@redhat.com PR tree-optimization/53645 * tree-vect-generic.c (expand_vector_divmod): Use MULT_HIGHPART_EXPR instead of VEC_WIDEN_MULT_{HI,LO}_EXPR followed by VEC_PERM_EXPR if possible. * gcc.c-torture/execute/pr53645-2.c: New test. 
--- gcc/tree-vect-generic.c.jj 2012-06-28 08:32:50.0 +0200 +++ gcc/tree-vect-generic.c 2012-06-28 09:10:51.436748834 +0200 @@ -455,7 +455,7 @@ expand_vector_divmod (gimple_stmt_iterat unsigned HOST_WIDE_INT mask = GET_MODE_MASK (TYPE_MODE (TREE_TYPE (type))); optab op; tree *vec; - unsigned char *sel; + unsigned char *sel = NULL; tree cur_op, mhi, mlo, mulcst, perm_mask, wider_type, tem; if (prec HOST_BITS_PER_WIDE_INT) @@ -744,26 +744,34 @@ expand_vector_divmod (gimple_stmt_iterat if (mode == -2 || BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN) return NULL_TREE; - op = optab_for_tree_code (VEC_WIDEN_MULT_LO_EXPR, type, optab_default); - if (op == NULL - || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing) -return NULL_TREE; - op = optab_for_tree_code (VEC_WIDEN_MULT_HI_EXPR, type, optab_default); - if (op == NULL - || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing) -return NULL_TREE; - sel = XALLOCAVEC (unsigned char, nunits); - for (i = 0; i nunits; i++) -sel[i] = 2 * i + (BYTES_BIG_ENDIAN ? 
0 : 1); - if (!can_vec_perm_p (TYPE_MODE (type), false, sel)) -return NULL_TREE; - wider_type -= build_vector_type (build_nonstandard_integer_type (prec * 2, unsignedp), -nunits / 2); - if (GET_MODE_CLASS (TYPE_MODE (wider_type)) != MODE_VECTOR_INT - || GET_MODE_BITSIZE (TYPE_MODE (wider_type)) -!= GET_MODE_BITSIZE (TYPE_MODE (type))) -return NULL_TREE; + op = optab_for_tree_code (MULT_HIGHPART_EXPR, type, optab_default); + if (op != NULL + optab_handler (op, TYPE_MODE (type)) != CODE_FOR_nothing) +wider_type = NULL_TREE; + else +{ + op = optab_for_tree_code (VEC_WIDEN_MULT_LO_EXPR, type, optab_default); + if (op == NULL + || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing) + return NULL_TREE; + op = optab_for_tree_code (VEC_WIDEN_MULT_HI_EXPR, type, optab_default); + if (op == NULL + || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing) + return NULL_TREE; + sel = XALLOCAVEC (unsigned char, nunits); + for (i = 0; i nunits; i++) + sel[i] = 2 * i + (BYTES_BIG_ENDIAN ? 0 : 1); + if (!can_vec_perm_p (TYPE_MODE (type), false, sel)) + return NULL_TREE; + wider_type + = build_vector_type (build_nonstandard_integer_type (prec * 2, +unsignedp), +nunits / 2); + if (GET_MODE_CLASS (TYPE_MODE (wider_type)) != MODE_VECTOR_INT + || GET_MODE_BITSIZE (TYPE_MODE (wider_type)) +!= GET_MODE_BITSIZE (TYPE_MODE (type))) + return NULL_TREE; +} cur_op = op0; @@ -772,7 +780,7 @@ expand_vector_divmod (gimple_stmt_iterat case 0: gcc_assert (unsignedp); /* t1 = oprnd0 pre_shift; -t2 = (type) (t1 w* ml prec); +t2 = t1 h* ml; q = t2 post_shift; */ cur_op = add_rshift (gsi, type, cur_op, pre_shifts); if (cur_op == NULL_TREE) @@ -801,30 +809,37 @@ expand_vector_divmod (gimple_stmt_iterat for (i = 0; i nunits; i++) vec[i] = build_int_cst (TREE_TYPE (type), mulc[i]); mulcst = build_vector (type, vec); - for (i = 0; i nunits; i++) -vec[i] = build_int_cst (TREE_TYPE (type), sel[i]); - perm_mask = build_vector (type, vec); - mhi = gimplify_build2 (gsi, VEC_WIDEN_MULT_HI_EXPR, 
wider_type, -cur_op, mulcst); - mhi = gimplify_build1 (gsi, VIEW_CONVERT_EXPR, type, mhi); - mlo = gimplify_build2 (gsi, VEC_WIDEN_MULT_LO_EXPR, wider_type, -cur_op, mulcst); - mlo = gimplify_build1 (gsi,
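For readers following the comment in the hunk above (presumably "t1 = oprnd0 >> pre_shift; t2 = t1 h* ml; q = t2 >> post_shift" before the archive stripped the shift operators), here is a minimal scalar sketch in C of dividing by a constant with a high-part multiply. It is not taken from the patch: the names mulhi_u16, div10_u16 and check_div10 are made up for illustration, and the constants (ml = 0xCCCD, pre_shift = 0, post_shift = 3, for unsigned 16-bit division by 10) were picked by hand rather than by GCC's own multiplier selection.

```c
#include <assert.h>
#include <stdint.h>

/* High 16 bits of the 16x16 -> 32 bit unsigned product.  This is the
   scalar analogue of a per-element high-part multiply ("h*").  */
uint16_t mulhi_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(((uint32_t)a * (uint32_t)b) >> 16);
}

/* Unsigned x / 10 without a divide instruction, following the
   three-step shape of the comment in the patch.  The constants are
   hand-picked assumptions for this one divisor.  */
uint16_t div10_u16(uint16_t x)
{
    uint16_t t1 = x;                      /* pre_shift = 0 */
    uint16_t t2 = mulhi_u16(t1, 0xCCCD);  /* t2 = t1 h* ml */
    return (uint16_t)(t2 >> 3);           /* post_shift = 3 */
}

/* Exhaustively compare against real division over all 16-bit inputs.  */
void check_div10(void)
{
    for (uint32_t x = 0; x <= UINT16_MAX; x++)
        assert(div10_u16((uint16_t)x) == x / 10);
}
```

The vector code in the patch applies the same idea lane by lane; when the target has no high-part multiply, the widening multiplies plus a permutation compute the same high halves.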
Re: [testsuite] don't use lto plugin if it doesn't work
On Thu, Jun 28, 2012 at 04:16:55AM -0300, Alexandre Oliva wrote:
> On Jun 27, 2012, Mike Stump mikest...@comcast.net wrote:
>
>> On Jun 27, 2012, at 2:07 AM, Alexandre Oliva wrote:
>>> Why? We don't demand a working plugin. Indeed, we disable the use
>>> of the plugin if we find a linker that doesn't support it. We just
>>> don't account for the possibility of finding a linker that supports
>>> plugins, but that doesn't support the one we'll build later.
>
>> If this is the preferred solution, then having configure check the
>> 64-bitness of ld and turning off the plugin altogether on mismatches
>> sounds like a reasonable course of action to me.
>
> I'd be very surprised if I asked for an i686 native build to package
> and install elsewhere, and didn't get a plugin just because the
> build-time linker wouldn't have been able to run the plugin.

Not disable plugin support altogether, but disable assuming the linker
supports the plugin. If the user passes an explicit
-f{,no-}use-linker-plugin, it is his problem to make sure that the
linker has support. But the problem is that when the build-time ld is
new enough, gcc assumes it has to support the plugin. And that is not
the case.

	Jakub
Re: [PATCH][configure] Make sure CFLAGS_FOR_TARGET And CXXFLAGS_FOR_TARGET contain -O2
On Jun 27, 2012, Christophe Lyon christophe.l...@st.com wrote:

>> I looked at the patch in there, and I'm afraid I don't understand how
>> it achieves the ChangeLog-suggested purpose of ensuring -O2 makes it
>> to C*FLAGS_FOR_TARGET, when all it appears to do is to prepend -g.
>> Can you please clarify?

> With more context, the current code fragment is:
>
>     CFLAGS_FOR_TARGET=$CFLAGS
>     case " $CFLAGS " in
>       *" -O2 "*) ;;
>       *) CFLAGS_FOR_TARGET="-O2 $CFLAGS" ;;
>     esac
>     case " $CFLAGS " in
>       *" -g "* | *" -g3 "*) ;;
>       *) CFLAGS_FOR_TARGET="-g $CFLAGS" ;;
>     esac
>
> where pre-pending -g discards -O2 if it was pre-pended just above.

I see, thanks for clarifying.

I suggest changing both occurrences of $CFLAGS within the case
statements, then; the more uniform logic is more appealing to me.

Patch approved with these changes. Thanks,

-- 
Alexandre Oliva
[Patch, Fortran] Handle C_F_POINTER with a noncontiguous SHAPE=
This patch generates inline code for C_F_POINTER with an array argument. One reason is that GCC didn't handle SHAPE= arguments which were noncontiguous. However, the real motivation is the fortran-dev branch with the new array-descriptor: C_F_POINTER needs then to set the stride multiplier, but as it doesn't know the size of a single element, one had either to pass the value or handle it partially in the front end. Hence, doing it all in the front-end was simpler. The C_F_Pointer issue is the main cause for failing test cases on the branch, though several other issues remain. Build and regtested on x86-64-linux- OK for the trunk? * * * If you wonder why I had some problems before: http://gcc.gnu.org/ml/fortran/2012-04/msg00115.html The reason is that I called pushlevel() twice for body: + gfc_start_block (body); + gfc_start_scalarized_body (loop, body); I removed the first one - and now it works. (Well, there were also some other issues in the patch, which are now fixed.) Tobias PS: After committal, I will update the patch for the branch; let's see how many failures will remain on the branch. PPS: The offset handling in gfortran is really complicated. I wonder whether we have to (or at least should) change it for the new array descriptor. 2012-06-27 Tobias Burnus bur...@net-b.de * trans-expr.c (conv_isocbinding_procedure): Generate c_f_pointer code inline. 2012-06-27 Tobias Burnus bur...@net-b.de * gfortran.dg/c_f_pointer_shape_tests_5.f90: New. * gfortran.dg/c_f_pointer_tests_3.f90: Update scan-tree-dump-times pattern. 
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c index 7d1a6d4..9ebde9d 100644 --- a/gcc/fortran/trans-expr.c +++ b/gcc/fortran/trans-expr.c @@ -3307,14 +3351,17 @@ conv_isocbinding_procedure (gfc_se * se, gfc_symbol * sym, return 1; } - else if ((sym-intmod_sym_id == ISOCBINDING_F_POINTER - arg-next-expr-rank == 0) + else if (sym-intmod_sym_id == ISOCBINDING_F_POINTER || sym-intmod_sym_id == ISOCBINDING_F_PROCPOINTER) { - /* Convert c_f_pointer if fptr is a scalar - and convert c_f_procpointer. */ + /* Convert c_f_pointer and c_f_procpointer. */ gfc_se cptrse; gfc_se fptrse; + gfc_se shapese; + gfc_ss *ss, *shape_ss; + tree desc, dim, tmp, stride, offset; + stmtblock_t body, block, ifblock; + gfc_loopinfo loop; gfc_init_se (cptrse, NULL); gfc_conv_expr (cptrse, arg-expr); @@ -3322,25 +3369,113 @@ conv_isocbinding_procedure (gfc_se * se, gfc_symbol * sym, gfc_add_block_to_block (se-post, cptrse.post); gfc_init_se (fptrse, NULL); - if (sym-intmod_sym_id == ISOCBINDING_F_POINTER - || gfc_is_proc_ptr_comp (arg-next-expr, NULL)) - fptrse.want_pointer = 1; + if (arg-next-expr-rank == 0) + { + if (sym-intmod_sym_id == ISOCBINDING_F_POINTER + || gfc_is_proc_ptr_comp (arg-next-expr, NULL)) + fptrse.want_pointer = 1; + + gfc_conv_expr (fptrse, arg-next-expr); + gfc_add_block_to_block (se-pre, fptrse.pre); + gfc_add_block_to_block (se-post, fptrse.post); + if (arg-next-expr-symtree-n.sym-attr.proc_pointer + arg-next-expr-symtree-n.sym-attr.dummy) + fptrse.expr = build_fold_indirect_ref_loc (input_location, + fptrse.expr); + se-expr = fold_build2_loc (input_location, MODIFY_EXPR, + TREE_TYPE (fptrse.expr), + fptrse.expr, + fold_convert (TREE_TYPE (fptrse.expr), + cptrse.expr)); + return 1; + } - gfc_conv_expr (fptrse, arg-next-expr); - gfc_add_block_to_block (se-pre, fptrse.pre); - gfc_add_block_to_block (se-post, fptrse.post); - - if (arg-next-expr-symtree-n.sym-attr.proc_pointer - arg-next-expr-symtree-n.sym-attr.dummy) - fptrse.expr = 
build_fold_indirect_ref_loc (input_location, - fptrse.expr); - - se-expr = fold_build2_loc (input_location, MODIFY_EXPR, - TREE_TYPE (fptrse.expr), - fptrse.expr, - fold_convert (TREE_TYPE (fptrse.expr), - cptrse.expr)); + gfc_start_block (block); + + /* Get the descriptor of the Fortran pointer. */ + ss = gfc_walk_expr (arg-next-expr); + gcc_assert (ss != gfc_ss_terminator); + fptrse.descriptor_only = 1; + gfc_conv_expr_descriptor (fptrse, arg-next-expr, ss); + gfc_add_block_to_block (block, fptrse.pre); + desc = fptrse.expr; + + /* Set data value, dtype, and offset. */ + tmp = GFC_TYPE_ARRAY_DATAPTR_TYPE (TREE_TYPE (desc)); + gfc_conv_descriptor_data_set (block, desc, +fold_convert (tmp, cptrse.expr)); + gfc_add_modify (block, gfc_conv_descriptor_dtype (desc), + gfc_get_dtype (TREE_TYPE (desc))); + + /* Start scalarization of the bounds, using the shape argument. */ + + shape_ss = gfc_walk_expr (arg-next-next-expr); + gcc_assert (shape_ss != gfc_ss_terminator); + gfc_init_se (shapese, NULL); + + gfc_init_loopinfo
Re: [onlinedocs]: No more automatic rebuilt?
libgomp.texi is still using gpl.texi, although libgomp has been
relicensed to GPLv3 in 2009. OK?

(This is the last use of gpl.texi in the gcc sources. Perhaps it should
be removed and gpl_v3.texi renamed back to gpl.texi?)

Andreas.

	* libgomp.texi: Include gpl_v3.texi instead of gpl.texi.

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 29c078b..f8996f4 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -7,7 +7,7 @@
 @copying
-Copyright @copyright{} 2006, 2007, 2008, 2010, 2011 Free Software Foundation, Inc.
+Copyright @copyright{} 2006, 2007, 2008, 2010, 2011, 2012 Free Software Foundation, Inc.
 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
@@ -1737,7 +1737,7 @@ Bugs in the GNU OpenMP implementation should be reported via
 @c GNU General Public License
 @c -
-@include gpl.texi
+@include gpl_v3.texi
-- 
1.7.11.1

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
And now for something completely different.
Re: [onlinedocs]: No more automatic rebuilt?
On Thu, Jun 28, 2012 at 10:18:49AM +0200, Andreas Schwab wrote:
> libgomp.texi is still using gpl.texi, although libgomp has been
> relicensed to GPLv3 in 2009. OK?

Yes.

> 	* libgomp.texi: Include gpl_v3.texi instead of gpl.texi.

	Jakub
Re: [RFA] Enable dump-noaddr test to work in out of build tree testing
On 27/06/12 21:35, Andrew Pinski wrote: On Wed, Jun 27, 2012 at 3:33 AM, Matthew Gretton-Dann matthew.gretton-d...@arm.com wrote: All, This patch enables the dump-noaddr test to work in out-of-build-tree testing. [snip] I created a much simpler patch which I have been meaning to submit. I attached it for reference. Thanks, Andrew Pinski ChangeLog: * testsuite/gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Use an absolute dump base instead of a relative one. Index: gcc.c-torture/unsorted/dump-noaddr.x === --- gcc.c-torture/unsorted/dump-noaddr.x(revision 61452) +++ gcc.c-torture/unsorted/dump-noaddr.x(revision 61453) @@ -11,10 +11,10 @@ proc dump_compare { src options } { foreach option $option_list { file delete -force dump1 file mkdir dump1 - c-torture-compile $src $option $options -dumpbase dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr + c-torture-compile $src $option $options -dumpbase [pwd]/dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr file delete -force dump2 file mkdir dump2 - c-torture-compile $src $option $options -dumpbase dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr + c-torture-compile $src $option $options -dumpbase [pwd]/dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr foreach dump1 [lsort [glob -nocomplain dump1/*]] { regsub dump1/ $dump1 dump2/ dump2 set dumptail gcc.c-torture/unsorted/[file tail $dump1] What I don't like about this approach is that dump1 and dump2 are created in the current working directory. With out of build-tree testing this may not (I believe) be the same as $tmpdir (where temporaries are normally created). Also the current directory may already contain directories/files called dump1 or dump2 which will get destroyed by running the testsuite. Hence why my approach used tmpdir. 
Does this reasoning make sense? I've not committed my version yet in case I am missing something in my reasoning above with regards to the relationship between the current working directory and $tmpdir. Thanks, Matt -- Matthew Gretton-Dann Principal Engineer, PD Software - Tools, ARM Ltd
RE: [PATCH] Disable loop2_invariant for -Os
>> diff --git a/gcc/loop-init.c b/gcc/loop-init.c
>> index 03f8f61..5d8cf73 100644
>> --- a/gcc/loop-init.c
>> +++ b/gcc/loop-init.c
>> @@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done =
>>  static bool
>>  gate_rtl_move_loop_invariants (void)
>>  {
>> +  /* In general, invariant motion can not reduce code size.  But it will
>> +     change the liverange of the invariant, which increases the register
>> +     pressure and might lead to more spilling.  */
>> +  if (optimize_function_for_size_p (cfun))
>> +    return false;
>> +
>
> Can you do this per loop instead?  Using optimize_loop_nest_for_size_p?

Updated according to the comments.

Thanks!
-Zhenqiang

diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index f8405dd..b0e84a7 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -1931,7 +1931,8 @@ move_loop_invariants (void)
       curr_loop = loop;
       /* move_single_loop_invariants for very large loops
          is time consuming and might need a lot of memory.  */
-      if (loop->num_nodes <= (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP)
+      if (loop->num_nodes <= (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP
+          && ! optimize_loop_nest_for_size_p (loop))
         move_single_loop_invariants (loop);
     }

ChangeLog:
2012-06-28  Zhenqiang Chen  zhenqiang.c...@arm.com

	* loop-invariant.c (move_loop_invariants): Skip
	move_single_loop_invariants when optimizing loop for size.
RE: [PATCH] Disable loop2_invariant for -Os
> -----Original Message-----
> From: Steven Bosscher [mailto:stevenb@gmail.com]
> Sent: June 27, 2012 16:54
> To: Zhenqiang Chen
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] Disable loop2_invariant for -Os
>
> On Wed, Jun 27, 2012 at 10:40 AM, Zhenqiang Chen zhenqiang.c...@arm.com wrote:
> > Hi,
> >
> > In general, invariant motion itself can not reduce code size. But it
> > will change the liverange of the invariant, which might lead to more
> > spilling.
>
> This may be true for ARM but it's not true in general.

Benchmark tests show it also benefits MIPS, PPC and X86 for code size.

> Sometimes loop-invariant address arithmetic, that is not exposed in
> GIMPLE, is profitable to hoist out of the loop. See e.g. PR41026 (for
> which I still have a patch in the queue).
>
> If this goes in anyway, please mention PR39837 in your ChangeLog entry.

It cannot handle that case.

Thanks!
-Zhenqiang
Re: [PATCH] Move Graphite from using PPL over to ISL
On 06/27/2012 05:06 PM, Richard Guenther wrote:
> This merges from the graphite branch the move of PPL to ISL, and
> completes it where it was lacking - thanks to Micha. It leaves
> unmerged the addition of a pluto-like ISL optimizer as well as a
> bugfix for stride 1 which did not come with a testcase.
>
> With this patch (on top of the one requiring ClooG 0.17.0) we will
> require ISL 0.10 for enabling Graphite.
>
> I've bootstrapped and built various combinations with in-tree and
> out-of-tree cloog and ISL, so I'm pretty confident that this works.
> With out-of-tree ClooG and ISL a slightly older patch on top of its
> prerequisite passed bootstrap and testing on x86_64-unknown-linux-gnu.
>
> Currently re-bootstrapping and testing on x86_64-unknown-linux-gnu.
>
> Ok for trunk?

Hi Richard, hi Micha,

thanks a lot for pushing this forward. Especially the fast
implementation of the interchange heuristic was impressive!

I am fine with the general goal and think the patch is close to getting
in, but I would like to give feedback on the interchange heuristic. I
will try to review it today or tomorrow.

Thanks again!!
Tobias
Re: [ARM Patch 1/n] PR53447: optimizations of 64bit ALU operation with constant
Hi Ramana Thanks for the review, please see my inlined comments. On Thu, Jun 28, 2012 at 12:02 AM, Ramana Radhakrishnan ramana.radhakrish...@linaro.org wrote: On 8 June 2012 10:12, Carrot Wei car...@google.com wrote: Hi In rtl expression, substract a constant c is expressed as add a value -c, so it is alse processed by adddi3, and I extend it more to handle a subtraction of 64bit constant. I created an insn pattern arm_subdi3_immediate to specifically represent substraction with 64bit constant while continue keeping the add rtl expression. Sorry about the time it has taken to review this patch -Thanks for tackling this but I'm not convinced that this patch is correct and definitely can be more efficient. The range of valid 64 bit constants allowed would be in my opinion are the following- obtained by dividing the 64 bit constant into 2 32 bit halves (upper32 and lower32 referred to as upper and lower below) arm_not_operand (upper) arm_add_operand (lower) which boils down to the valid combination of adds lo : adc hi - both positive constants. adds lo ; sbc hi - lower positive, upper negative I assume you mean sbc -hi or sbc abs(hi), similar for following instructions subs lo ; sbc hi - lower negative, upper negative subs lo ; adc hi - lower negative, upper positive My first version did the similar thing, but in some cases subs and adds may generate different carry flag. Assume the low word is 0 and high word is negative, your method will generate adds r0, r0, 0 sbc r1, r1, abs(hi) My method generates subs r0, r0, 0 sbc r1, r1, abs(hi) ARM's definition of subs is (result, carry, overflow) = AddWithCarry(R[n], NOT(imm32), ‘1’); So the subs instruction will set carry flag, but adds clear carry flag, and finally generate different result in r1. Therefore I'd do the following - * Don't make *arm_adddi3 a named pattern - we don't need that. 
* Change the *addsi3_carryin_optab pattern to be something like this : --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -1001,12 +1001,14 @@ ) (define_insn *addsi3_carryin_optab - [(set (match_operand:SI 0 s_register_operand =r) - (plus:SI (plus:SI (match_operand:SI 1 s_register_operand %r) - (match_operand:SI 2 arm_rhs_operand rI)) + [(set (match_operand:SI 0 s_register_operand =r,r) + (plus:SI (plus:SI (match_operand:SI 1 s_register_operand %r,r + (match_operand:SI 2 arm_not_operand rI,K Do you mean arm_add_operand? (LTUGEU:SI (reg:cnb CC_REGNUM) (const_int 0] TARGET_32BIT - adc%?\\t%0, %1, %2 + @ + adc%?\\t%0, %1, %2 + sbc%?\\t%0, %1, %#n2 [(set_attr conds use)] ) * I'd like a new const_ok_for_dimode_op function that dealt with each of these operations, thus your plus operation with a DImode constant would just be a check similar to what I've said above. Good idea, it will make the interface cleaner. I will do it later. * You then don't need the new subdi3_immediate pattern and the split can happen after reload. Adjust predicates and constraints accordingly, delete it. Also please use CONST_INT_P instead of Even if I delete subdi3_immediate pattern, we still need the predicates and constraints to represent the negative di numbers in other patterns. thanks Carrot
Re: [PATCH][configure] Make sure CFLAGS_FOR_TARGET And CXXFLAGS_FOR_TARGET contain -O2
On 28.06.2012 09:32, Alexandre Oliva wrote: I suggest changing both occurrences of $CFLAGS within the case statements, then; the more uniform logic is more appealing to me. Patch approved with these changes. Thanks, Thanks; here is an updated version taking your comment into account. Can you commit it for me (I don't have write access). Thanks. Christophe. 2012-06-28 Christophe Lyon christophe.l...@st.com * configure.ac (CFLAGS_FOR_TARGET, CXXFLAGS_FOR_TARGET): Make sure they contain -O2. * configure: Regenerate. diff --git a/configure b/configure index 083f2ce..1ab12db 100755 --- a/configure +++ b/configure @@ -6690,11 +6690,11 @@ if test x$CFLAGS_FOR_TARGET = x; then CFLAGS_FOR_TARGET=$CFLAGS case $CFLAGS in * -O2 *) ;; -*) CFLAGS_FOR_TARGET=-O2 $CFLAGS ;; +*) CFLAGS_FOR_TARGET=-O2 $CFLAGS_FOR_TARGET ;; esac case $CFLAGS in * -g * | * -g3 *) ;; -*) CFLAGS_FOR_TARGET=-g $CFLAGS ;; +*) CFLAGS_FOR_TARGET=-g $CFLAGS_FOR_TARGET ;; esac fi @@ -6703,11 +6703,11 @@ if test x$CXXFLAGS_FOR_TARGET = x; then CXXFLAGS_FOR_TARGET=$CXXFLAGS case $CXXFLAGS in * -O2 *) ;; -*) CXXFLAGS_FOR_TARGET=-O2 $CXXFLAGS ;; +*) CXXFLAGS_FOR_TARGET=-O2 $CXXFLAGS_FOR_TARGET ;; esac case $CXXFLAGS in * -g * | * -g3 *) ;; -*) CXXFLAGS_FOR_TARGET=-g $CXXFLAGS ;; +*) CXXFLAGS_FOR_TARGET=-g $CXXFLAGS_FOR_TARGET ;; esac fi diff --git a/configure.ac b/configure.ac index 378e9f5..82dbe4c 100644 --- a/configure.ac +++ b/configure.ac @@ -2145,11 +2145,11 @@ if test x$CFLAGS_FOR_TARGET = x; then CFLAGS_FOR_TARGET=$CFLAGS case $CFLAGS in * -O2 *) ;; -*) CFLAGS_FOR_TARGET=-O2 $CFLAGS ;; +*) CFLAGS_FOR_TARGET=-O2 $CFLAGS_FOR_TARGET ;; esac case $CFLAGS in * -g * | * -g3 *) ;; -*) CFLAGS_FOR_TARGET=-g $CFLAGS ;; +*) CFLAGS_FOR_TARGET=-g $CFLAGS_FOR_TARGET ;; esac fi AC_SUBST(CFLAGS_FOR_TARGET) @@ -2158,11 +2158,11 @@ if test x$CXXFLAGS_FOR_TARGET = x; then CXXFLAGS_FOR_TARGET=$CXXFLAGS case $CXXFLAGS in * -O2 *) ;; -*) CXXFLAGS_FOR_TARGET=-O2 $CXXFLAGS ;; +*) CXXFLAGS_FOR_TARGET=-O2 $CXXFLAGS_FOR_TARGET ;; 
esac case $CXXFLAGS in * -g * | * -g3 *) ;; -*) CXXFLAGS_FOR_TARGET=-g $CXXFLAGS ;; +*) CXXFLAGS_FOR_TARGET=-g $CXXFLAGS_FOR_TARGET ;; esac fi AC_SUBST(CXXFLAGS_FOR_TARGET)
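The effect of the change can also be modelled outside the shell. The following C sketch is only an illustration of the two variants discussed in this thread, not part of configure: has_flag, prepend, target_flags_old and target_flags_new are hypothetical names, and has_flag assumes the shell tests match whole space-separated words (as in case " $CFLAGS " in *" -O2 "*).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Nonzero if FLAG occurs in FLAGS as a whole space-separated word.  */
int has_flag(const char *flags, const char *flag)
{
    char padded[512], needle[64];
    snprintf(padded, sizeof padded, " %s ", flags);
    snprintf(needle, sizeof needle, " %s ", flag);
    return strstr(padded, needle) != NULL;
}

/* dst = "FLAG REST".  REST may alias DST, hence the temporary.  */
void prepend(char *dst, size_t n, const char *flag, const char *rest)
{
    char tmp[512];
    assert(n > 0);
    snprintf(tmp, sizeof tmp, "%s %s", flag, rest);
    snprintf(dst, n, "%s", tmp);
}

/* Before the patch: both branches rebuild the result from CFLAGS, so
   the second assignment throws away the -O2 added by the first.  */
void target_flags_old(const char *cflags, char *out, size_t n)
{
    snprintf(out, n, "%s", cflags);
    if (!has_flag(cflags, "-O2"))
        prepend(out, n, "-O2", cflags);
    if (!has_flag(cflags, "-g") && !has_flag(cflags, "-g3"))
        prepend(out, n, "-g", cflags);
}

/* After the patch: both branches extend the value built so far.  */
void target_flags_new(const char *cflags, char *out, size_t n)
{
    snprintf(out, n, "%s", cflags);
    if (!has_flag(cflags, "-O2"))
        prepend(out, n, "-O2", out);
    if (!has_flag(cflags, "-g") && !has_flag(cflags, "-g3"))
        prepend(out, n, "-g", out);
}
```

With CFLAGS set to "-Wall", the old logic yields "-g -Wall" while the new logic yields "-g -O2 -Wall"; when CFLAGS already carries both -g and -O2, the two variants agree.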
Re: [PATCH] Disable loop2_invariant for -Os
On Thu, Jun 28, 2012 at 10:33 AM, Zhenqiang Chen zhenqiang.c...@arm.com wrote:
>>> diff --git a/gcc/loop-init.c b/gcc/loop-init.c
>>> index 03f8f61..5d8cf73 100644
>>> --- a/gcc/loop-init.c
>>> +++ b/gcc/loop-init.c
>>> @@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done =
>>>  static bool
>>>  gate_rtl_move_loop_invariants (void)
>>>  {
>>> +  /* In general, invariant motion can not reduce code size.  But it will
>>> +     change the liverange of the invariant, which increases the register
>>> +     pressure and might lead to more spilling.  */
>>> +  if (optimize_function_for_size_p (cfun))
>>> +    return false;
>>> +
>>
>> Can you do this per loop instead?  Using optimize_loop_nest_for_size_p?
>
> Update it according to the comments.
>
> Thanks!
> -Zhenqiang
>
> diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
> index f8405dd..b0e84a7 100644
> --- a/gcc/loop-invariant.c
> +++ b/gcc/loop-invariant.c
> @@ -1931,7 +1931,8 @@ move_loop_invariants (void)
>        curr_loop = loop;
>        /* move_single_loop_invariants for very large loops
>           is time consuming and might need a lot of memory.  */
> -      if (loop->num_nodes <= (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP)
> +      if (loop->num_nodes <= (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP
> +          && ! optimize_loop_nest_for_size_p (loop))
>          move_single_loop_invariants (loop);

Wait - move_single_loop_invariants itself already uses
optimize_loop_for_speed_p. And looking down it seems to have support
for tracking spill cost (eventually only with -fira-loop-pressure) -
please work out why this support is not working for you.

Richard.

>      }
>
> ChangeLog:
> 2012-06-28  Zhenqiang Chen  zhenqiang.c...@arm.com
>
> 	* loop-invariant.c (move_loop_invariants): Skip
> 	move_single_loop_invariants when optimizing loop for size
Re: [ARM Patch 1/n] PR53447: optimizations of 64bit ALU operation with constant
On 28 June 2012 10:03, Carrot Wei car...@google.com wrote: Hi Ramana Thanks for the review, please see my inlined comments. On Thu, Jun 28, 2012 at 12:02 AM, Ramana Radhakrishnan ramana.radhakrish...@linaro.org wrote: On 8 June 2012 10:12, Carrot Wei car...@google.com wrote: Hi In rtl expression, substract a constant c is expressed as add a value -c, so it is alse processed by adddi3, and I extend it more to handle a subtraction of 64bit constant. I created an insn pattern arm_subdi3_immediate to specifically represent substraction with 64bit constant while continue keeping the add rtl expression. Sorry about the time it has taken to review this patch -Thanks for tackling this but I'm not convinced that this patch is correct and definitely can be more efficient. The range of valid 64 bit constants allowed would be in my opinion are the following- obtained by dividing the 64 bit constant into 2 32 bit halves (upper32 and lower32 referred to as upper and lower below) arm_not_operand (upper) arm_add_operand (lower) which boils down to the valid combination of adds lo : adc hi - both positive constants. adds lo ; sbc hi - lower positive, upper negative I assume you mean sbc -hi or sbc abs(hi), similar for following instructions hi = ~upper32 lower = lower 32 bits of the constant hi = ~ (upper32 bits) of the constant ( bitwise twiddle not a negate :) ) For e.g. unsigned long long foo4 (unsigned long long x) { return x - 0x25ULL; } should be subs r0, r0, #37 sbc r1, r1, #0 Notice that it's #0 and not 1 . :) subs lo ; sbc hi - lower negative, upper negative subs lo ; adc hi - lower negative, upper positive My first version did the similar thing, but in some cases subs and adds may generate different carry flag. 
Assume the low word is 0 and high word is negative, your method will generate adds r0, r0, 0 sbc r1, r1, abs(hi) No it will generate adds r0, r0, #0 sbcr1, r1, ~hi and not abs (hi) My method generates subs r0, r0, 0 sbc r1, r1, abs(hi) ARM's definition of subs is (result, carry, overflow) = AddWithCarry(R[n], NOT(imm32), ‘1’); So the subs instruction will set carry flag, but adds clear carry flag, and finally generate different result in r1. Therefore I'd do the following - * Don't make *arm_adddi3 a named pattern - we don't need that. * Change the *addsi3_carryin_optab pattern to be something like this : --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -1001,12 +1001,14 @@ ) (define_insn *addsi3_carryin_optab - [(set (match_operand:SI 0 s_register_operand =r) - (plus:SI (plus:SI (match_operand:SI 1 s_register_operand %r) - (match_operand:SI 2 arm_rhs_operand rI)) + [(set (match_operand:SI 0 s_register_operand =r,r) + (plus:SI (plus:SI (match_operand:SI 1 s_register_operand %r,r + (match_operand:SI 2 arm_not_operand rI,K Do you mean arm_add_operand? No I mean arm_not_operand and it was a deliberate choice as explained above. (LTUGEU:SI (reg:cnb CC_REGNUM) (const_int 0] TARGET_32BIT - adc%?\\t%0, %1, %2 + @ + adc%?\\t%0, %1, %2 + sbc%?\\t%0, %1, %#n2 [(set_attr conds use)] ) * I'd like a new const_ok_for_dimode_op function that dealt with each of these operations, thus your plus operation with a DImode constant would just be a check similar to what I've said above. Good idea, it will make the interface cleaner. I will do it later. I think it should help with a clean interface for all the operations you plan to add. * You then don't need the new subdi3_immediate pattern and the split can happen after reload. Adjust predicates and constraints accordingly, delete it. Also please use CONST_INT_P instead of Even if I delete subdi3_immediate pattern, we still need the predicates and constraints to represent the negative di numbers in other patterns. 
I agree you need the predicate - I suspect you can get away with a single constraint for all valid add immediate DImode operands especially if you are splitting it later to the constituent forms. regards, Ramana thanks Carrot
Re: [PATCH, gdc] - Merging gdc (GNU D Compiler) into gcc
On 27 June 2012 19:17, Mike Stump mikest...@comcast.net wrote:
> On Jun 27, 2012, at 7:45 AM, Iain Buclaw wrote:
>> I do have a question though, what is available for the transition of
>> development from git to svn? Other than a lot of reading and getting
>> used to the various switches and commands on my part.
>
> Why transition? Quite a few people around here use git on a day to day
> basis and just push and pull to/from svn as they see fit. gcc has a
> read-only git repo you can track and pull from. For pushing into svn,
> you can use git to do that as well (dcommit). You'll want to read up
> on work flows on the net... as dcommit and merges require a little
> extra caution that isn't obvious.

I did not know of this, thanks. I'll be sure to look it up.

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';
Re: [ARM Patch 1/n] PR53447: optimizations of 64bit ALU operation with constant
On 28/06/12 10:03, Carrot Wei wrote:

Hi Ramana

Thanks for the review, please see my inlined comments.

On Thu, Jun 28, 2012 at 12:02 AM, Ramana Radhakrishnan ramana.radhakrish...@linaro.org wrote:

On 8 June 2012 10:12, Carrot Wei car...@google.com wrote:

Hi

In RTL, subtracting a constant c is expressed as adding the value -c, so it is also processed by adddi3, and I extended it to handle subtraction of a 64-bit constant. I created an insn pattern arm_subdi3_immediate to specifically represent subtraction of a 64-bit constant while keeping the add RTL expression.

Sorry about the time it has taken to review this patch. Thanks for tackling this, but I'm not convinced that this patch is correct, and it can definitely be more efficient.

The valid 64-bit constants would, in my opinion, be the following, obtained by dividing the 64-bit constant into two 32-bit halves (upper32 and lower32, referred to as upper and lower below):

  arm_not_operand (upper) && arm_add_operand (lower)

which boils down to the valid combinations of

  adds lo ; adc hi - both positive constants.
  adds lo ; sbc hi - lower positive, upper negative

I assume you mean sbc -hi or sbc abs(hi), similar for following instructions

No, it's sbc ~hi -- bitwise inversion

It all falls out from the specification, where adc == X + Y + C and sbc == X + ~Y + C. Hence the need to use arm_not_operand.

R.
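As a minimal sketch of the identity quoted above (adc == X + Y + C, sbc == X + ~Y + C), the following C model adds a 64-bit constant as an "adds lo ; sbc ~hi" pair. The helper names (add_with_carry, arm_adc, arm_sbc, add64_adds_sbc) are made up for illustration and are not part of the patch or of GCC; the model assumes the immediates are encodable and does not model the overflow flag.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the ARM AddWithCarry pseudo-function: returns the 32-bit
   result and stores the carry-out flag.  Overflow is not modelled.  */
uint32_t
add_with_carry (uint32_t x, uint32_t y, unsigned carry_in, unsigned *carry_out)
{
  uint64_t wide = (uint64_t) x + y + carry_in;
  *carry_out = (unsigned) (wide >> 32);
  return (uint32_t) wide;
}

/* adc: X + Y + C.  */
uint32_t
arm_adc (uint32_t x, uint32_t imm, unsigned carry)
{
  unsigned ignored;
  return add_with_carry (x, imm, carry, &ignored);
}

/* sbc: X + ~Y + C.  */
uint32_t
arm_sbc (uint32_t x, uint32_t imm, unsigned carry)
{
  unsigned ignored;
  return add_with_carry (x, ~imm, carry, &ignored);
}

/* Compute x + c as "adds lo ; sbc ~hi".  Because sbc inverts its
   operand again, passing ~hi makes the high half equal to adc hi.  */
uint64_t
add64_adds_sbc (uint64_t x, uint64_t c)
{
  uint32_t lo = (uint32_t) c;
  uint32_t hi = (uint32_t) (c >> 32);
  unsigned carry;
  uint32_t r0 = add_with_carry ((uint32_t) x, lo, 0, &carry);   /* adds */
  uint32_t r1 = arm_sbc ((uint32_t) (x >> 32), ~hi, carry);     /* sbc ~hi */
  return ((uint64_t) r1 << 32) | r0;
}
```

The point of the sbc form is only the immediate: for an upper half such as 0xFFFFFF00, the inverted value 0x000000FF may be encodable where the original is not, while the arithmetic result is the same as adc with the original value.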
Re: [patch] support for multiarch systems
Hi!

On Mon, 25 Jun 2012 18:19:26 +0200, Matthias Klose d...@ubuntu.com wrote:

On 25.06.2012 15:56, Joseph S. Myers wrote:

On Mon, 25 Jun 2012, Matthias Klose wrote:

Please find attached the patch updated for trunk 20120625, x86 only, tested on x86-linux-gnu, KFreeBSD and the Hurd.

2012-06-25  Matthias Klose  d...@ubuntu.com

        * doc/invoke.texi: Document -print-multiarch.
        * doc/install.texi: Document --enable-multiarch.
        * doc/fragments.texi: Document MULTILIB_OSDIRNAMES, MULTIARCH_DIRNAME.
        * configure.ac: Add --enable-multiarch option.
        * configure.in: Regenerate.
        * Makefile.in (s-mlib): Pass MULTIARCH_DIRNAME to genmultilib.
        enable_multiarch, with_float: New macros.
        if_multiarch: New macro, define in terms of enable_multiarch.
        * genmultilib: Add new argument for the multiarch name.
        * gcc.c (multiarch_dir): Define.
        (for_each_path): Search for multiarch suffixes.
        (driver_handle_option): Handle multiarch option.
        (do_spec_1): Pass -imultiarch if defined.
        (main): Print multiarch.
        (set_multilib_dir): Separate multilib and multiarch names
        from multilib_select.
        (print_multilib_info): Ignore multiarch names in multilib_select.
        * incpath.c (add_standard_paths): Search the multiarch include dirs.
        * cppdefault.h (default_include): Document multiarch in multilib
        member.
        * cppdefault.c: [LOCAL_INCLUDE_DIR, STANDARD_INCLUDE_DIR] Add an
        include directory for multiarch directories.
        * common.opt: New options --print-multiarch and -imultilib.
        * config.gcc: Add tmake fragments to tmake_file
        ( i386/t-kfreebsd for i[34567]86-*-kfreebsd*-gnu and
        x86_64-*-kfreebsd*-gnu, i386/t-gnu for i[34567]86-*-gnu*).
        * config/i386/t-kfreebsd: Add multiarch names in
        MULTILIB_OSDIRNAMES, define MULTIARCH_DIRNAME.
        * config/i386/t-linux64: Likewise.
        * config/i386/t-linux: Define MULTIARCH_DIRNAME.
        * config/i386/t-gnu: Likewise.

As I said before, »config/i386/t-{gnu,kfreebsd,linux}« are new files.

Instead of repeating: my comments from http://news.gmane.org/find-root.php?message_id=%3C87zk94cg1h.fsf%40schwinge.name%3E as well as the follow-up still hold.

Index: genmultilib
===================================================================
--- genmultilib (revision 188931)
+++ genmultilib (working copy)
@@ -84,6 +84,8 @@
 # This argument can be used together with MULTILIB_EXCEPTIONS and will take
 # effect after the MULTILIB_EXCEPTIONS.

+# The optional eight argument is the multiarch name.

»ninth argument«.

Grüße,
 Thomas
Re: [C++ RFC / Patch] PR 51213 (access control under SFINAE)
On 06/15/2012 04:27 PM, Paolo Carlini wrote: Hi, as I mentioned a few days ago, I'm working on implementing this feature, which I personally consider rather high priority, from the library point of view too (eg, type_traits). I have been making some progress - I'm attaching below what I have so far in my local tree - but I also think it's time to get feedback both about the general approach and about more specific issues with the testsuite. ... any comments on this? Thanks! Paolo.
Re: [Ada] Attribute 'Old should only be used in postconditions
2012-06-26  Yannick Moy  m...@adacore.com

        * sem_attr.adb (Analyze_Attribute): Detect if 'Old is used outside a
        postcondition, and issue an error in such a case.

This has introduced the following failures in the gnat.dg testsuite:

FAIL: gnat.dg/deep_old.adb (test for excess errors)
FAIL: gnat.dg/old_errors.adb (test for errors, line 7)
FAIL: gnat.dg/old_errors.adb (test for errors, line 16)
FAIL: gnat.dg/old_errors.adb (test for errors, line 28)
FAIL: gnat.dg/old_errors.adb (test for errors, line 34)
FAIL: gnat.dg/old_errors.adb (test for errors, line 38)
FAIL: gnat.dg/old_errors.adb (test for warnings, line 40)
FAIL: gnat.dg/old_errors.adb (test for errors, line 44)
FAIL: gnat.dg/old_errors.adb (test for excess errors)

What should we do about them?

--
Eric Botcazou
Re: [Ada] Attribute 'Old should only be used in postconditions
        * sem_attr.adb (Analyze_Attribute): Detect if 'Old is used outside a
        postcondition, and issue an error in such a case.

This has introduced the following failures in the gnat.dg testsuite:

FAIL: gnat.dg/deep_old.adb (test for excess errors)
FAIL: gnat.dg/old_errors.adb (test for errors, line 7)
FAIL: gnat.dg/old_errors.adb (test for errors, line 16)
FAIL: gnat.dg/old_errors.adb (test for errors, line 28)
FAIL: gnat.dg/old_errors.adb (test for errors, line 34)
FAIL: gnat.dg/old_errors.adb (test for errors, line 38)
FAIL: gnat.dg/old_errors.adb (test for warnings, line 40)
FAIL: gnat.dg/old_errors.adb (test for errors, line 44)
FAIL: gnat.dg/old_errors.adb (test for excess errors)

What should we do about them?

Probably suppress both, since they no longer make sense (they are testing an early implementation of 'Old, before 'Old was standardized in Ada 2012).

I'll take care of it.

Arno
Re: [PATCH] Add generic vector lowering for integer division and modulus (PR tree-optimization/53645)
On Wed, 27 Jun 2012, Jakub Jelinek wrote:

Hi!

This patch makes veclower2 attempt to emit integer division/modulus of vectors by constants using vector multiplication, shifts or masking.

It is somewhat similar to the vect_recog_divmod_pattern, but it needs to analyze everything first, see if all divisions or modulos are doable using the same sequence of vector insns, and then emit vector insns as opposed to the scalar ones the pattern recognizer adds.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok. I wonder what to do for -O0 though - shouldn't we not call expand_vector_divmod in that case? Thus,

+ if (!optimize
+     || !VECTOR_INTEGER_TYPE_P (type)
+     || TREE_CODE (rhs2) != VECTOR_CST)
+   break;

?

Thanks,
Richard.

The testcase additionally eyeballed even for -mavx2, which unlike -mavx has vector >> vector shifts.

2012-06-27  Jakub Jelinek  ja...@redhat.com

        PR tree-optimization/53645
        * tree-vect-generic.c (add_rshift): New function.
        (expand_vector_divmod): New function.
        (expand_vector_operation): Use it for vector integer
        TRUNC_{DIV,MOD}_EXPR by VECTOR_CST.
        * tree-vect-patterns.c (vect_recog_divmod_pattern): Replace
        unused lguup variable with dummy_int.

        * gcc.c-torture/execute/pr53645.c: New test.

--- gcc/tree-vect-generic.c.jj 2012-06-26 10:00:42.935832834 +0200
+++ gcc/tree-vect-generic.c 2012-06-27 10:15:20.534103045 +0200
@@ -391,6 +391,515 @@ expand_vector_comparison (gimple_stmt_it
   return t;
 }

+/* Helper function of expand_vector_divmod.  Gimplify a RSHIFT_EXPR in type
+   of OP0 with shift counts in SHIFTCNTS array and return the temporary holding
+   the result if successful, otherwise return NULL_TREE.  */
+static tree
+add_rshift (gimple_stmt_iterator *gsi, tree type, tree op0, int *shiftcnts)
+{
+  optab op;
+  unsigned int i, nunits = TYPE_VECTOR_SUBPARTS (type);
+  bool scalar_shift = true;
+
+  for (i = 1; i < nunits; i++)
+    {
+      if (shiftcnts[i] != shiftcnts[0])
+        scalar_shift = false;
+    }
+
+  if (scalar_shift && shiftcnts[0] == 0)
+    return op0;
+
+  if (scalar_shift)
+    {
+      op = optab_for_tree_code (RSHIFT_EXPR, type, optab_scalar);
+      if (op != NULL
+          && optab_handler (op, TYPE_MODE (type)) != CODE_FOR_nothing)
+        return gimplify_build2 (gsi, RSHIFT_EXPR, type, op0,
+                                build_int_cst (NULL_TREE, shiftcnts[0]));
+    }
+
+  op = optab_for_tree_code (RSHIFT_EXPR, type, optab_vector);
+  if (op != NULL
+      && optab_handler (op, TYPE_MODE (type)) != CODE_FOR_nothing)
+    {
+      tree *vec = XALLOCAVEC (tree, nunits);
+      for (i = 0; i < nunits; i++)
+        vec[i] = build_int_cst (TREE_TYPE (type), shiftcnts[i]);
+      return gimplify_build2 (gsi, RSHIFT_EXPR, type, op0,
+                              build_vector (type, vec));
+    }
+
+  return NULL_TREE;
+}
+
+/* Try to expand integer vector division by constant using
+   widening multiply, shifts and additions.  */
+static tree
+expand_vector_divmod (gimple_stmt_iterator *gsi, tree type, tree op0,
+                      tree op1, enum tree_code code)
+{
+  bool use_pow2 = true;
+  bool has_vector_shift = true;
+  int mode = -1, this_mode;
+  int pre_shift = -1, post_shift;
+  unsigned int nunits = TYPE_VECTOR_SUBPARTS (type);
+  int *shifts = XALLOCAVEC (int, nunits * 4);
+  int *pre_shifts = shifts + nunits;
+  int *post_shifts = pre_shifts + nunits;
+  int *shift_temps = post_shifts + nunits;
+  unsigned HOST_WIDE_INT *mulc = XALLOCAVEC (unsigned HOST_WIDE_INT, nunits);
+  int prec = TYPE_PRECISION (TREE_TYPE (type));
+  int dummy_int;
+  unsigned int i, unsignedp = TYPE_UNSIGNED (TREE_TYPE (type));
+  unsigned HOST_WIDE_INT mask = GET_MODE_MASK (TYPE_MODE (TREE_TYPE (type)));
+  optab op;
+  tree *vec;
+  unsigned char *sel;
+  tree cur_op, mhi, mlo, mulcst, perm_mask, wider_type, tem;
+
+  if (prec > HOST_BITS_PER_WIDE_INT)
+    return NULL_TREE;
+
+  op = optab_for_tree_code (RSHIFT_EXPR, type, optab_vector);
+  if (op == NULL
+      || optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing)
+    has_vector_shift = false;
+
+  /* Analysis phase.  Determine if all op1 elements are either power
+     of two and it is possible to expand it using shifts (or for remainder
+     using masking).  Additionally compute the multiplicative constants
+     and pre and post shifts if the division is to be expanded using
+     widening or high part multiplication plus shifts.  */
+  for (i = 0; i < nunits; i++)
+    {
+      tree cst = VECTOR_CST_ELT (op1, i);
+      unsigned HOST_WIDE_INT ml;
+
+      if (!host_integerp (cst, unsignedp) || integer_zerop (cst))
+        return NULL_TREE;
+      pre_shifts[i] = 0;
+      post_shifts[i] = 0;
+      mulc[i] = 0;
+      if (use_pow2
+          && (!integer_pow2p (cst) || tree_int_cst_sgn (cst) != 1))
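For readers unfamiliar with the trick the patch description refers to (division by a constant via multiplication and shifts, or via shifts and masking for powers of two), here is a scalar C sketch applied element by element. It is not the code from the patch: the function names, the fixed element count NUNITS and the choice of unsigned 16-bit elements with the divisors 3 and 8 are assumptions made for illustration. The multiplier 0xAAAB is ceil(2^17 / 3), so the high half of the product shifted right by one more bit gives x / 3 for every 16-bit x.

```c
#include <assert.h>
#include <stdint.h>

#define NUNITS 8  /* elements per illustrative "vector" (assumption) */

/* High 16 bits of the 32-bit product: a scalar stand-in for what a
   high-part multiply computes for each element.  */
uint16_t
mulhi_u16 (uint16_t a, uint16_t b)
{
  return (uint16_t) (((uint32_t) a * b) >> 16);
}

/* out[i] = in[i] / 3, using multiply-high plus a post-shift of 1.  */
void
vec_div3_u16 (const uint16_t *in, uint16_t *out)
{
  for (unsigned i = 0; i < NUNITS; i++)
    out[i] = (uint16_t) (mulhi_u16 (in[i], 0xAAAB) >> 1);
}

/* out[i] = in[i] % 3, computed as in[i] - (in[i] / 3) * 3.  */
void
vec_mod3_u16 (const uint16_t *in, uint16_t *out)
{
  uint16_t q[NUNITS];
  vec_div3_u16 (in, q);
  for (unsigned i = 0; i < NUNITS; i++)
    out[i] = (uint16_t) (in[i] - q[i] * 3);
}

/* Power-of-two divisor: quotient by shift, remainder by masking.  */
void
vec_divmod8_u16 (const uint16_t *in, uint16_t *quot, uint16_t *rem)
{
  for (unsigned i = 0; i < NUNITS; i++)
    {
      quot[i] = (uint16_t) (in[i] >> 3);
      rem[i] = (uint16_t) (in[i] & 7);
    }
}
```

Signed elements and divisors that need a pre-shift or an add-back step take extra fix-up operations that this sketch leaves out.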
Re: [patch] support for multiarch systems
On 28.06.2012 12:01, Thomas Schwinge wrote:

Hi!

On Mon, 25 Jun 2012 18:19:26 +0200, Matthias Klose d...@ubuntu.com wrote:

On 25.06.2012 15:56, Joseph S. Myers wrote:

On Mon, 25 Jun 2012, Matthias Klose wrote:

Please find attached the patch updated for trunk 20120625, x86 only, tested on x86-linux-gnu, KFreeBSD and the Hurd.

2012-06-25  Matthias Klose  d...@ubuntu.com

        * doc/invoke.texi: Document -print-multiarch.
        * doc/install.texi: Document --enable-multiarch.
        * doc/fragments.texi: Document MULTILIB_OSDIRNAMES, MULTIARCH_DIRNAME.
        * configure.ac: Add --enable-multiarch option.
        * configure.in: Regenerate.
        * Makefile.in (s-mlib): Pass MULTIARCH_DIRNAME to genmultilib.
        enable_multiarch, with_float: New macros.
        if_multiarch: New macro, define in terms of enable_multiarch.
        * genmultilib: Add new argument for the multiarch name.
        * gcc.c (multiarch_dir): Define.
        (for_each_path): Search for multiarch suffixes.
        (driver_handle_option): Handle multiarch option.
        (do_spec_1): Pass -imultiarch if defined.
        (main): Print multiarch.
        (set_multilib_dir): Separate multilib and multiarch names
        from multilib_select.
        (print_multilib_info): Ignore multiarch names in multilib_select.
        * incpath.c (add_standard_paths): Search the multiarch include dirs.
        * cppdefault.h (default_include): Document multiarch in multilib
        member.
        * cppdefault.c: [LOCAL_INCLUDE_DIR, STANDARD_INCLUDE_DIR] Add an
        include directory for multiarch directories.
        * common.opt: New options --print-multiarch and -imultilib.
        * config.gcc: Add tmake fragments to tmake_file
        ( i386/t-kfreebsd for i[34567]86-*-kfreebsd*-gnu and
        x86_64-*-kfreebsd*-gnu, i386/t-gnu for i[34567]86-*-gnu*).
        * config/i386/t-kfreebsd: Add multiarch names in
        MULTILIB_OSDIRNAMES, define MULTIARCH_DIRNAME.
        * config/i386/t-linux64: Likewise.
        * config/i386/t-linux: Define MULTIARCH_DIRNAME.
        * config/i386/t-gnu: Likewise.

As I said before, »config/i386/t-{gnu,kfreebsd,linux}« are new files.

Instead of repeating: my comments from http://news.gmane.org/find-root.php?message_id=%3C87zk94cg1h.fsf%40schwinge.name%3E as well as the follow-up still hold.

Like

        * config/i386/t-gnu: New, define MULTIARCH_DIRNAME.

?

Index: genmultilib
===================================================================
--- genmultilib (revision 188931)
+++ genmultilib (working copy)
@@ -84,6 +84,8 @@
 # This argument can be used together with MULTILIB_EXCEPTIONS and will take
 # effect after the MULTILIB_EXCEPTIONS.

+# The optional eight argument is the multiarch name.

»ninth argument«.

fixed.
Re: [onlinedocs]: No more automatic rebuilt?
On Thu, 28 Jun 2012, Andreas Schwab wrote:

libgomp.texi is still using gpl.texi, although libgomp has been relicensed to GPLv3 in 2009. OK?

Looks good, thank you.

(This is the last use of gpl.texi in the gcc sources. Perhaps it should be removed and gpl_v3.texi renamed back to gpl.texi?)

If it's not used any more, yes, please go ahead and remove it. As for renaming gpl_v3.texi to gpl.texi, I'm not sure.

Gerald
Re: [patch] support for multiarch systems
Hi!

On Thu, 28 Jun 2012 12:42:23 +0200, Matthias Klose d...@ubuntu.com wrote:

On 28.06.2012 12:01, Thomas Schwinge wrote:

On Mon, 25 Jun 2012 18:19:26 +0200, Matthias Klose d...@ubuntu.com wrote:

On 25.06.2012 15:56, Joseph S. Myers wrote:

On Mon, 25 Jun 2012, Matthias Klose wrote:

Please find attached the patch updated for trunk 20120625, x86 only, tested on x86-linux-gnu, KFreeBSD and the Hurd.

2012-06-25  Matthias Klose  d...@ubuntu.com

        * doc/invoke.texi: Document -print-multiarch.
        * doc/install.texi: Document --enable-multiarch.
        * doc/fragments.texi: Document MULTILIB_OSDIRNAMES, MULTIARCH_DIRNAME.
        * configure.ac: Add --enable-multiarch option.
        * configure.in: Regenerate.
        * Makefile.in (s-mlib): Pass MULTIARCH_DIRNAME to genmultilib.
        enable_multiarch, with_float: New macros.
        if_multiarch: New macro, define in terms of enable_multiarch.
        * genmultilib: Add new argument for the multiarch name.
        * gcc.c (multiarch_dir): Define.
        (for_each_path): Search for multiarch suffixes.
        (driver_handle_option): Handle multiarch option.
        (do_spec_1): Pass -imultiarch if defined.
        (main): Print multiarch.
        (set_multilib_dir): Separate multilib and multiarch names
        from multilib_select.
        (print_multilib_info): Ignore multiarch names in multilib_select.
        * incpath.c (add_standard_paths): Search the multiarch include dirs.
        * cppdefault.h (default_include): Document multiarch in multilib
        member.
        * cppdefault.c: [LOCAL_INCLUDE_DIR, STANDARD_INCLUDE_DIR] Add an
        include directory for multiarch directories.
        * common.opt: New options --print-multiarch and -imultilib.
        * config.gcc: Add tmake fragments to tmake_file
        ( i386/t-kfreebsd for i[34567]86-*-kfreebsd*-gnu and
        x86_64-*-kfreebsd*-gnu, i386/t-gnu for i[34567]86-*-gnu*).
        * config/i386/t-kfreebsd: Add multiarch names in
        MULTILIB_OSDIRNAMES, define MULTIARCH_DIRNAME.
        * config/i386/t-linux64: Likewise.
        * config/i386/t-linux: Define MULTIARCH_DIRNAME.
        * config/i386/t-gnu: Likewise.

As I said before, »config/i386/t-{gnu,kfreebsd,linux}« are new files.

Instead of repeating: my comments from http://news.gmane.org/find-root.php?message_id=%3C87zk94cg1h.fsf%40schwinge.name%3E as well as the follow-up still hold.

Like

        * config/i386/t-gnu: New, define MULTIARCH_DIRNAME.

?

I'd use:

        * config/i386/t-gnu: New file.
        * config/i386/t-kfreebsd: Likewise.
        * config/i386/t-linux: Likewise.

Plus the following instead of your changes:

gcc/
        * config.gcc i[34567]86-*-linux* | x86_64-*-linux* (tmake_file):
        Include i386/t-linux.
        i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu (tmake_file):
        Include i386/t-kfreebsd.
        i[34567]86-*-gnu* (tmake_file): Include i386/t-gnu.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 7ec184c..39c70f2 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3481,9 +3481,14 @@ case ${target} in
         i[34567]86-*-darwin* | x86_64-*-darwin*)
                 ;;
-        i[34567]86-*-linux* | x86_64-*-linux* | \
-        i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu | \
-        i[34567]86-*-gnu*)
+        i[34567]86-*-linux* | x86_64-*-linux*)
+                tmake_file="$tmake_file i386/t-linux"
+                ;;
+        i[34567]86-*-kfreebsd*-gnu | x86_64-*-kfreebsd*-gnu)
+                tmake_file="$tmake_file i386/t-kfreebsd"
+                ;;
+        i[34567]86-*-gnu*)
+                tmake_file="$tmake_file i386/t-gnu"
                 ;;
         i[34567]86-*-solaris2* | x86_64-*-solaris2.1[0-9]*)
                 ;;

Otherwise, I can't imagine how that would work.

Grüße,
 Thomas
Re: [ARM Patch 1/n] PR53447: optimizations of 64bit ALU operation with constant
On Thu, Jun 28, 2012 at 5:37 PM, Ramana Radhakrishnan ramana.radhakrish...@linaro.org wrote:

On 28 June 2012 10:03, Carrot Wei car...@google.com wrote:

Hi Ramana

Thanks for the review, please see my inlined comments.

On Thu, Jun 28, 2012 at 12:02 AM, Ramana Radhakrishnan ramana.radhakrish...@linaro.org wrote:

On 8 June 2012 10:12, Carrot Wei car...@google.com wrote:

Hi

In RTL, subtracting a constant c is expressed as adding the value -c, so it is also processed by adddi3, and I extended it to handle subtraction of a 64-bit constant. I created an insn pattern arm_subdi3_immediate to specifically represent subtraction of a 64-bit constant while keeping the add RTL expression.

Sorry about the time it has taken to review this patch. Thanks for tackling this, but I'm not convinced that this patch is correct, and it can definitely be more efficient.

The valid 64-bit constants would, in my opinion, be the following, obtained by dividing the 64-bit constant into two 32-bit halves (upper32 and lower32, referred to as upper and lower below):

  arm_not_operand (upper) && arm_add_operand (lower)

which boils down to the valid combinations of

  adds lo ; adc hi - both positive constants.
  adds lo ; sbc hi - lower positive, upper negative

I assume you mean sbc -hi or sbc abs(hi), similar for following instructions

hi = ~upper32

lower = lower 32 bits of the constant
hi = ~ (upper32 bits) of the constant ( bitwise twiddle not a negate :) )

For e.g.

unsigned long long foo4 (unsigned long long x)
{
  return x - 0x25ULL;
}

should be

  subs r0, r0, #37
  sbc r1, r1, #0

Notice that it's #0 and not 1 . :)

  subs lo ; sbc hi - lower negative, upper negative
  subs lo ; adc hi - lower negative, upper positive

Thank you for the detailed explanation. So the four cases should be

  adds lo ; adc hi - both positive constants.
  adds lo ; sbc ~hi - lower positive, upper negative
  subs -lo ; sbc ~hi - lower negative, upper negative
  subs -lo ; adc hi - lower negative, upper positive

My first version did a similar thing, but in some cases subs and adds may generate a different carry flag. Assume the low word is 0 and high word is negative, your method will generate

  adds r0, r0, 0
  sbc r1, r1, abs(hi)

No it will generate

  adds r0, r0, #0
  sbc r1, r1, ~hi

and not abs (hi)

My method generates

  subs r0, r0, 0
  sbc r1, r1, abs(hi)

ARM's definition of subs is

  (result, carry, overflow) = AddWithCarry(R[n], NOT(imm32), ‘1’);

So the subs instruction will set the carry flag, but adds clears the carry flag, and they finally generate different results in r1.

Therefore I'd do the following -

* Don't make *arm_adddi3 a named pattern - we don't need that.

* Change the *addsi3_carryin_<optab> pattern to be something like this :

--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1001,12 +1001,14 @@
 )

 (define_insn "*addsi3_carryin_<optab>"
-  [(set (match_operand:SI 0 "s_register_operand" "=r")
-       (plus:SI (plus:SI (match_operand:SI 1 "s_register_operand" "%r")
-                         (match_operand:SI 2 "arm_rhs_operand" "rI"))
+  [(set (match_operand:SI 0 "s_register_operand" "=r,r")
+       (plus:SI (plus:SI (match_operand:SI 1 "s_register_operand" "%r,r")
+                         (match_operand:SI 2 "arm_not_operand" "rI,K"))

Do you mean arm_add_operand?

No I mean arm_not_operand and it was a deliberate choice as explained above.

                 (LTUGEU:SI (reg:<cnb> CC_REGNUM) (const_int 0))))]
   "TARGET_32BIT"
-  "adc%?\\t%0, %1, %2"
+  "@
+  adc%?\\t%0, %1, %2
+  sbc%?\\t%0, %1, %#n2"

Since constraint K is logical not, not negative, should the last line be the following?

+  sbc%?\\t%0, %1, #%B2

thanks
Carrot
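The carry-flag point raised above can be checked with a small C model of AddWithCarry. This is only a sketch under the definitions quoted in the thread (adds is AddWithCarry(Rn, imm, 0), subs is AddWithCarry(Rn, NOT(imm), 1), sbc is X + ~Y + C); the function names are invented for illustration and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the ARM AddWithCarry pseudo-function: returns the 32-bit
   result and stores the carry-out flag.  Overflow is not modelled.  */
uint32_t
add_with_carry (uint32_t x, uint32_t y, unsigned carry_in, unsigned *carry_out)
{
  uint64_t wide = (uint64_t) x + y + carry_in;
  *carry_out = (unsigned) (wide >> 32);
  return (uint32_t) wide;
}

/* Carry flag left by "adds rd, rn, #imm".  */
unsigned
carry_after_adds (uint32_t rn, uint32_t imm)
{
  unsigned carry;
  (void) add_with_carry (rn, imm, 0, &carry);
  return carry;
}

/* Carry flag left by "subs rd, rn, #imm".  */
unsigned
carry_after_subs (uint32_t rn, uint32_t imm)
{
  unsigned carry;
  (void) add_with_carry (rn, ~imm, 1, &carry);
  return carry;
}

/* High word produced for x + c by the pair "subs #-lo ; sbc #~hi".  */
uint32_t
high_word_subs_sbc (uint64_t x, uint64_t c)
{
  uint32_t lo = (uint32_t) c;
  uint32_t hi = (uint32_t) (c >> 32);
  uint32_t imm = ~hi;                 /* immediate given to sbc */
  unsigned carry = carry_after_subs ((uint32_t) x, 0u - lo);
  unsigned ignored;
  /* sbc computes X + ~imm + C.  */
  return add_with_carry ((uint32_t) (x >> 32), ~imm, carry, &ignored);
}
```

In this model the subs form agrees with adds whenever the low word is non-zero (the foo4 case, x - 0x25, comes out as subs #37 ; sbc #0 with the right high word), but with a zero low word subs #0 always leaves the carry set, so the high word ends up one too large. That matches the concern about choosing subs purely from the sign of the constant.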
Re: [testsuite] don't use lto plugin if it doesn't work
On Jun 28, 2012, Jakub Jelinek ja...@redhat.com wrote:

On Thu, Jun 28, 2012 at 04:16:55AM -0300, Alexandre Oliva wrote:

I'd be very surprised if I asked for an i686 native build to package and install elsewhere, and didn't get a plugin just because the build-time linker wouldn't have been able to run the plugin.

Not disable plugin support altogether, but disable assuming the linker supports the plugin.

That still doesn't sound right to me: why should the compiler refrain from using a perfectly functional linker plugin on the machine where it's installed (not where it's built)?

Also, this scenario of silently deciding whether or not to use the linker plugin could bring us to different test results for the same command lines. I don't like that.

--
Alexandre Oliva
[PATCH] Fix PR53790
This fixes PR53790 - with MEM_REF you can get base decls of incomplete type. Deal with that.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied everywhere.

Richard.

2012-06-28  Richard Guenther  rguent...@suse.de

        PR middle-end/53790
        * expr.c (expand_expr_real_1): Verify if the type is complete
        before inspecting its size.

        * gcc.dg/torture/pr53790.c: New testcase.

Index: gcc/expr.c
===================================================================
*** gcc/expr.c (revision 189041)
--- gcc/expr.c (working copy)
*************** expand_expr_real_1 (tree exp, rtx target
*** 9832,9837 ****
--- 9832,9838 ----
          orig_op0 = op0
            = expand_expr (tem,
                           (TREE_CODE (TREE_TYPE (tem)) == UNION_TYPE
+                           && COMPLETE_TYPE_P (TREE_TYPE (tem))
                            && (TREE_CODE (TYPE_SIZE (TREE_TYPE (tem)))
                                != INTEGER_CST)
                            && modifier != EXPAND_STACK_PARM
Index: gcc/testsuite/gcc.dg/torture/pr53790.c
===================================================================
*** gcc/testsuite/gcc.dg/torture/pr53790.c (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr53790.c (working copy)
***************
*** 0 ****
--- 1,17 ----
+ /* { dg-do compile } */
+
+ typedef struct s {
+     int value;
+ } s_t;
+
+ static inline int
+ read(s_t const *var)
+ {
+     return var->value;
+ }
+
+ int main()
+ {
+     extern union u extern_var;
+     return read((s_t *)&extern_var);
+ }
Re: [onlinedocs]: No more automatic rebuilt?
Gerald Pfeifer ger...@pfeifer.com writes:

If it's not used any more, yes, please go ahead and remove it.

Done as this, tested with make info.

Andreas.

        * doc/include/gpl.texi: Remove.
        * doc/sourcebuild.texi (Texinfo Manuals): Don't mention gpl.texi.

diff --git a/gcc/doc/include/gpl.texi b/gcc/doc/include/gpl.texi
deleted file mode 100644
index bcb5535..000
[omitted]
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 3d834ee..dc5cc47 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1,4 +1,4 @@
-@c Copyright (C) 2002, 2003, 2004, 2005, 2007, 2008, 2009, 2010, 2011
+@c Copyright (C) 2002, 2003, 2004, 2005, 2007, 2008, 2009, 2010, 2011, 2012
 @c Free Software Foundation, Inc.
 @c This is part of the GCC manual.
 @c For copying conditions, see the file gcc.texi.
@@ -368,8 +368,7 @@ The GNU Free Documentation License.
 The section ``Funding Free Software''.
 @item gcc-common.texi
 Common definitions for manuals.
-@item gpl.texi
-@itemx gpl_v3.texi
+@item gpl_v3.texi
 The GNU General Public License.
 @item texinfo.tex
 A copy of @file{texinfo.tex} known to work with the GCC manuals.
--
1.7.11.1

--
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
And now for something completely different.
Re: [testsuite] don't use lto plugin if it doesn't work
On Thu, Jun 28, 2012 at 1:39 PM, Alexandre Oliva aol...@redhat.com wrote:

On Jun 28, 2012, Jakub Jelinek ja...@redhat.com wrote:

On Thu, Jun 28, 2012 at 04:16:55AM -0300, Alexandre Oliva wrote:

I'd be very surprised if I asked for an i686 native build to package and install elsewhere, and didn't get a plugin just because the build-time linker wouldn't have been able to run the plugin.

Not disable plugin support altogether, but disable assuming the linker supports the plugin.

That still doesn't sound right to me: why should the compiler refrain from using a perfectly functional linker plugin on the machine where it's installed (not where it's built)?

Also, this scenario of silently deciding whether or not to use the linker plugin could bring us to different test results for the same command lines. I don't like that.

I don't like that we derive the default setting this way either. In the end I would like us to arrive at the point that LTO does not work at all without a linker plugin.

Richard.
Re: [ARM Patch 1/n] PR53447: optimizations of 64bit ALU operation with constant
  subs -lo ; sbc ~hi - lower negative, upper negative
  subs -lo ; adc hi - lower negative, upper positive

Yes.

snip

                 (LTUGEU:SI (reg:<cnb> CC_REGNUM) (const_int 0))))]
   "TARGET_32BIT"
-  "adc%?\\t%0, %1, %2"
+  "@
+  adc%?\\t%0, %1, %2
+  sbc%?\\t%0, %1, %#n2"

Since constraint K is logical not, not negative, should the last line be the following?

+  sbc%?\\t%0, %1, #%B2

Indeed that was a typo on my part. Sorry about that.

Ramana
Re: [PATCH][configure] Make sure CFLAGS_FOR_TARGET And CXXFLAGS_FOR_TARGET contain -O2
On Jun 28, 2012, Christophe Lyon christophe.l...@st.com wrote:

Can you commit it for me (I don't have write access).

Done, GCC SVN and src CVS trees. Thanks!

2012-06-28  Christophe Lyon  christophe.l...@st.com

        * configure.ac (CFLAGS_FOR_TARGET, CXXFLAGS_FOR_TARGET): Make sure
        they contain -O2.
        * configure: Regenerate.

--
Alexandre Oliva
[patch]: Fix PR53595 (hard_regno_call_part_clobbered called with invalid regno)
This patch returns false in HARD_REGNO_CALL_PART_CLOBBERED if !HARD_REGNO_MODE_OK.

Returning true for such registers might lead to performance degradation that eats up all the performance gained from 4.6 to 4.7, for example.

Ok to apply?

Johann

        PR 53595
        * config/avr/avr.c (avr_hard_regno_call_part_clobbered): New.
        * config/avr/avr-protos.h (avr_hard_regno_call_part_clobbered): New.
        * config/avr/avr.h (HARD_REGNO_CALL_PART_CLOBBERED): Forward to
        avr_hard_regno_call_part_clobbered.

Index: config/avr/avr-protos.h
===================================================================
--- config/avr/avr-protos.h (revision 189011)
+++ config/avr/avr-protos.h (working copy)
@@ -47,6 +47,7 @@ extern void init_cumulative_args (CUMULA
 #endif /* TREE_CODE */

 #ifdef RTX_CODE
+extern int avr_hard_regno_call_part_clobbered (unsigned, enum machine_mode);
 extern const char *output_movqi (rtx insn, rtx operands[], int *l);
 extern const char *output_movhi (rtx insn, rtx operands[], int *l);
 extern const char *output_movsisf (rtx insn, rtx operands[], int *l);
Index: config/avr/avr.c
===================================================================
--- config/avr/avr.c (revision 189011)
+++ config/avr/avr.c (working copy)
@@ -8856,6 +8856,28 @@ avr_hard_regno_mode_ok (int regno, enum
 }


+/* Implement `HARD_REGNO_CALL_PART_CLOBBERED'.  */
+
+int
+avr_hard_regno_call_part_clobbered (unsigned regno, enum machine_mode mode)
+{
+  /* FIXME: This hook gets called with MODE:REGNO combinations that don't
+        represent valid hard registers like, e.g. HI:29.  Returning TRUE
+        for such registers can lead to performance degradation as mentioned
+        in PR53595.  Thus, report invalid hard registers as FALSE.  */
+
+  if (!avr_hard_regno_mode_ok (regno, mode))
+    return 0;
+
+  /* Return true if any of the following boundaries is crossed:
+     17/18, 27/28 and 29/30.  */
+
+  return ((regno < 18 && regno + GET_MODE_SIZE (mode) > 18)
+          || (regno < REG_Y && regno + GET_MODE_SIZE (mode) > REG_Y)
+          || (regno < REG_Z && regno + GET_MODE_SIZE (mode) > REG_Z));
+}
+
+
 /* Implement `MODE_CODE_BASE_REG_CLASS'.  */

 enum reg_class
Index: config/avr/avr.h
===================================================================
--- config/avr/avr.h (revision 189011)
+++ config/avr/avr.h (working copy)
@@ -402,10 +402,8 @@ enum reg_class {

 #define REGNO_OK_FOR_INDEX_P(NUM) 0

-#define HARD_REGNO_CALL_PART_CLOBBERED(REGNO, MODE)                    \
-  (((REGNO) < 18 && (REGNO) + GET_MODE_SIZE (MODE) > 18)               \
-   || ((REGNO) < REG_Y && (REGNO) + GET_MODE_SIZE (MODE) > REG_Y)      \
-   || ((REGNO) < REG_Z && (REGNO) + GET_MODE_SIZE (MODE) > REG_Z))
+#define HARD_REGNO_CALL_PART_CLOBBERED(REGNO, MODE)     \
+  avr_hard_regno_call_part_clobbered (REGNO, MODE)

 #define TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P hook_bool_mode_true
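To make the boundary test in the patch above easier to follow, here is a stand-alone C sketch of the same check with plain integers in place of GCC's machine modes. The function names are invented for this example, and mode_ok_sketch is a deliberately simplified stand-in (multi-byte values must start in an even register) for avr_hard_regno_mode_ok, whose real rules are more involved; REG_Y = 28 and REG_Z = 30 follow the 27/28 and 29/30 boundaries named in the patch comment.

```c
#include <assert.h>
#include <stdbool.h>

enum { REG_Y = 28, REG_Z = 30 };  /* Y starts at r28, Z at r30 */

/* True if a value of SIZE bytes starting in register REGNO straddles
   BOUNDARY, i.e. starts below it and extends past it.  */
bool
crosses_boundary (unsigned regno, unsigned size, unsigned boundary)
{
  return regno < boundary && regno + size > boundary;
}

/* Simplified validity check (assumption, not the real hook): values
   wider than one byte must start in an even register.  */
bool
mode_ok_sketch (unsigned regno, unsigned size)
{
  return size == 1 || regno % 2 == 0;
}

/* Sketch of the patched logic: invalid REGNO/size combinations are
   reported as not clobbered; valid ones are checked against the
   17/18, 27/28 and 29/30 boundaries.  */
bool
call_part_clobbered_sketch (unsigned regno, unsigned size)
{
  if (!mode_ok_sketch (regno, size))
    return false;

  return (crosses_boundary (regno, size, 18)
          || crosses_boundary (regno, size, REG_Y)
          || crosses_boundary (regno, size, REG_Z));
}
```

With these definitions the HI:29 case from the FIXME comment (two bytes starting at r29) does straddle the 29/30 boundary, which is why the old macro answered true for it, while the sketch answers false because the combination is rejected first.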
[PATCH] MIPS/libgcc: Add soft-fp support for SDE bare-iron targets
Hello,

This change adds soft-fp support for SDE bare-iron targets. The settings have been mostly based on the version already present in glibc, except that the ABI variations have been merged into a single file and conditionalised on preprocessor macros (and the file reformatted to follow the GNU coding standard, which the glibc variants don't).

Only n32 has to be treated somewhat specially, as it is ILP32 but its long long type is 64-bit with native support (using single registers rather than pairs). The rest is handled generically, based on the width of the types chosen.

This has been regression tested for the mips-sde-elf target with no new failures, using the o32 and n64 ABI multilibs, with and without -msoft-float, o32 also with MIPS16 variants. There's currently no SDE runtime support for n32. However, despite the inability to test, I decided the configuration shouldn't be pessimised by default (by avoiding the special exception and using the 32-bit long type), as glibc already uses such an arrangement so it's been verified elsewhere, and if a platform that supports the n32 ABI decides later on to enable soft-fp too, it will be verified in libgcc anyway. I believe this is reasonable and avoids the risk of someone choosing the long type by omission.

Comments or questions are welcome, otherwise OK to apply?

2012-06-28  Catherine Moore  c...@codesourcery.com
            Maciej W. Rozycki  ma...@codesourcery.com

        libgcc/
        * config/mips/sfp-machine.h: New file.
        * config.host mips*-sde-elf*: Enable soft-fp.
Maciej

gcc-mips-softfp.diff
Index: gcc-trunk-4.6/libgcc/config/mips/sfp-machine.h
===
--- /dev/null 1970-01-01 00:00:00.0 +
+++ gcc-trunk-4.6/libgcc/config/mips/sfp-machine.h 2012-06-24 14:38:40.083663725 +0100
@@ -0,0 +1,101 @@
+#if defined _ABIN32 && _MIPS_SIM == _ABIN32
+
+#define _FP_W_TYPE_SIZE 64
+#define _FP_W_TYPE unsigned long long
+#define _FP_WS_TYPE signed long long
+#define _FP_I_TYPE long long
+
+#else
+
+#define _FP_W_TYPE_SIZE _MIPS_SZLONG
+#define _FP_W_TYPE unsigned long
+#define _FP_WS_TYPE signed long
+#define _FP_I_TYPE long
+
+#endif
+
+#if _FP_W_TYPE_SIZE < 64
+
+#define _FP_MUL_MEAT_S(R, X, Y) \
+  _FP_MUL_MEAT_1_wide (_FP_WFRACBITS_S, R, X, Y, umul_ppmm)
+#define _FP_MUL_MEAT_D(R, X, Y) \
+  _FP_MUL_MEAT_2_wide (_FP_WFRACBITS_D, R, X, Y, umul_ppmm)
+#define _FP_MUL_MEAT_Q(R, X, Y) \
+  _FP_MUL_MEAT_4_wide (_FP_WFRACBITS_Q, R, X, Y, umul_ppmm)
+
+#define _FP_DIV_MEAT_S(R, X, Y) \
+  _FP_DIV_MEAT_1_udiv_norm (S, R, X, Y)
+#define _FP_DIV_MEAT_D(R, X, Y) \
+  _FP_DIV_MEAT_2_udiv (D, R, X, Y)
+#define _FP_DIV_MEAT_Q(R, X, Y) \
+  _FP_DIV_MEAT_4_udiv (Q, R, X, Y)
+
+#else
+
+#define _FP_MUL_MEAT_S(R, X, Y) \
+  _FP_MUL_MEAT_1_imm (_FP_WFRACBITS_S, R, X, Y)
+#define _FP_MUL_MEAT_D(R, X, Y) \
+  _FP_MUL_MEAT_1_wide (_FP_WFRACBITS_D, R, X, Y, umul_ppmm)
+#define _FP_MUL_MEAT_Q(R, X, Y) \
+  _FP_MUL_MEAT_2_wide_3mul (_FP_WFRACBITS_Q, R, X, Y, umul_ppmm)
+
+#define _FP_DIV_MEAT_S(R, X, Y) \
+  _FP_DIV_MEAT_1_imm (S, R, X, Y, _FP_DIV_HELP_imm)
+#define _FP_DIV_MEAT_D(R, X, Y) \
+  _FP_DIV_MEAT_1_udiv_norm (D, R, X, Y)
+#define _FP_DIV_MEAT_Q(R, X, Y) \
+  _FP_DIV_MEAT_2_udiv (Q, R, X, Y)
+
+#endif
+
+#define _FP_NANFRAC_S ((_FP_QNANBIT_S << 1) - 1)
+#define _FP_NANFRAC_D ((_FP_QNANBIT_D << 1) - 1), -1
+#define _FP_NANFRAC_Q ((_FP_QNANBIT_Q << 1) - 1), -1, -1, -1
+#define _FP_NANSIGN_S 0
+#define _FP_NANSIGN_D 0
+#define _FP_NANSIGN_Q 0
+
+#define _FP_KEEPNANFRACP 1
+/* From my experiments it seems X is chosen unless one of the
+   NaNs is sNaN, in which case the result is NANSIGN/NANFRAC.  */
+#define _FP_CHOOSENAN(fs, wc, R, X, Y, OP) \
+  do { \
+    if ((_FP_FRAC_HIGH_RAW_##fs (X) \
+         | _FP_FRAC_HIGH_RAW_##fs (Y)) & _FP_QNANBIT_##fs) \
+      { \
+        R##_s = _FP_NANSIGN_##fs; \
+        _FP_FRAC_SET_##wc (R, _FP_NANFRAC_##fs); \
+      } \
+    else \
+      {
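For readers skimming the thread, the word-size choice described in the cover letter can be sketched in isolation. This is only an illustration under stated assumptions, not the posted file: the `sketch_`/`SKETCH_` names are invented here, and `ULONG_MAX` stands in for `_MIPS_SZLONG` so that the fragment also compiles on non-MIPS hosts.

```c
#include <limits.h>

/* n32 is ILP32, yet 64-bit integers live in single registers there,
   so the soft-fp word is widened to long long.  Everywhere else the
   word simply follows the width of long.  */
#if defined _ABIN32 && defined _MIPS_SIM && _MIPS_SIM == _ABIN32
# define SKETCH_W_TYPE_SIZE 64
typedef unsigned long long sketch_w_type;
#elif ULONG_MAX == 0xffffffffUL
# define SKETCH_W_TYPE_SIZE 32
typedef unsigned long sketch_w_type;
#else
/* Assumption: a long that is not 32 bits wide is 64 bits wide.  */
# define SKETCH_W_TYPE_SIZE 64
typedef unsigned long sketch_w_type;
#endif

/* Number of bits in the word type that was actually chosen.  */
int
sketch_word_bits (void)
{
  return (int) (sizeof (sketch_w_type) * CHAR_BIT);
}
```

The point of the n32 special case is that the generic rule would pick a 32-bit word on an ABI that handles 64-bit quantities natively.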
Re: [PATCH] Move Graphite from using PPL over to ISL
On 12-06-27 11:06 , Richard Guenther wrote: 2012-06-27 Richard Guenther rguent...@suse.de Michael Matz m...@suse.de Tobias Grosser tob...@grosser.es Sebastian Pop seb...@gmail.com config/ * cloog.m4: Set up to work against ISL only. * isl.m4: New file. * Makefile.def: Add ISL host module, remove PPL host module. Adjust ClooG host module to use the proper ISL. * Makefile.tpl: Pass ISL include flags instead of PPL ones. * configure.ac: Include config/isl.m4. Add ISL host library, remove PPL. Remove PPL configury, add ISL configury, adjust ClooG configury. * Makefile.in: Regenerated. * configure: Likewise. gcc/ * Makefile.in: Remove PPL flags in favor of ISL ones. (BACKENDLIBS): Remove PPL libs. (INCLUDES): Remove PPL includes in favor of ISL ones. (graphite-clast-to-gimple.o): Remove graphite-dependences.h and graphite-cloog-compat.h dependencies. (graphite-dependences.o): Likewise. (graphite-poly.o): Likewise. * configure.ac: Declare ISL vars instead of PPL ones. * configure: Regenerated. * doc/install.texi: Replace PPL requirement documentation with ISL one. * graphite-blocking.c: Remove PPL code, add ISL equivalent. * graphite-clast-to-gimple.c: Likewise. * graphite-dependences.c: Likewise. * graphite-interchange.c: Likewise. * graphite-poly.h: Likewise. * graphite-poly.c: Likewise. * graphite-sese-to-poly.c: Likewise. * graphite.c: Likewise. * graphite-scop-detection.c: Re-arrange includes. * graphite-cloog-util.c: Remove. * graphite-cloog-util.h: Likewise. * graphite-ppl.h: Likewise. * graphite-ppl.c: Likewise. * graphite-dependences.h: Likewise. libgomp/ * testsuite/libgomp.graphite/force-parallel-4.c: Adjust. * testsuite/libgomp.graphite/force-parallel-5.c: Likewise. * testsuite/libgomp.graphite/force-parallel-7.c: Likewise. * testsuite/libgomp.graphite/force-parallel-8.c: Likewise. OK. Diego.
Re: [patch]: Fix PR53595 (hard_regno_call_part_clobbered called with invalid regno)
2012/6/28 Georg-Johann Lay a...@gjlay.de: This patch returns false in HARD_REGNO_CALL_PART_CLOBBERED if !HARD_REGNO_MODE_OK. Returning true for such registers might lead to performance degradation that eats up all the performance gained from 4.6 to 4.7, for example. Ok to apply? Johann PR 53595 * config/avr/avr.c (avr_hard_regno_call_part_clobbered): New. * config/avr/avr-protos.h (avr_hard_regno_call_part_clobbered): New. * config/avr/avr.h (HARD_REGNO_CALL_PART_CLOBBERED): Forward to avr_hard_regno_call_part_clobbered. Please, apply. Denis
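The shape of that check can be sketched on a toy register file. Everything below is hypothetical: the register ranges, the `toy_` names and the mode-validity rule are assumptions made for illustration and are not taken from config/avr/avr.c. The only part grounded in the message above is the early return for register/size pairs that are not valid in the first place.

```c
#include <stdbool.h>

/* Toy register file, loosely modelled on an 8-bit target: registers
   are one byte wide and r2..r17 survive calls.  Assumed values.  */
#define TOY_FIRST_SAVED 2u
#define TOY_LAST_SAVED  17u
#define TOY_NUM_REGS    32u

/* Hypothetical stand-in for HARD_REGNO_MODE_OK: multi-byte values
   must start in an even register and fit in the register file.  */
bool
toy_regno_mode_ok (unsigned regno, unsigned size)
{
  if (regno + size > TOY_NUM_REGS)
    return false;
  return size == 1 || regno % 2 == 0;
}

static bool
toy_call_saved (unsigned regno)
{
  return regno >= TOY_FIRST_SAVED && regno <= TOY_LAST_SAVED;
}

/* A value counts as partly clobbered here when it straddles
   call-saved and call-clobbered registers.  Invalid regno/size
   pairs answer false, as the patch description asks.  */
bool
toy_call_part_clobbered (unsigned regno, unsigned size)
{
  bool any_saved = false, any_clobbered = false;
  unsigned i;

  if (!toy_regno_mode_ok (regno, size))
    return false;

  for (i = 0; i < size; i++)
    {
      if (toy_call_saved (regno + i))
        any_saved = true;
      else
        any_clobbered = true;
    }
  return any_saved && any_clobbered;
}
```

In this toy model a 4-byte value starting at r16 is partly clobbered, while a 2-byte value starting at r17 is rejected before the straddle test is even reached.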
Re: [RFA] Enable dump-noaddr test to work in out of build tree testing
On Jun 28, 2012, at 1:28 AM, Matthew Gretton-Dann matthew.gretton-d...@arm.com wrote: On 27/06/12 21:35, Andrew Pinski wrote: On Wed, Jun 27, 2012 at 3:33 AM, Matthew Gretton-Dann matthew.gretton-d...@arm.com wrote: All, This patch enables the dump-noaddr test to work in out-of-build-tree testing. [snip] I created a much simpler patch which I have been meaning to submit. I attached it for reference. Thanks, Andrew Pinski ChangeLog: * testsuite/gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Use an absolute dump base instead of a relative one. Index: gcc.c-torture/unsorted/dump-noaddr.x === --- gcc.c-torture/unsorted/dump-noaddr.x(revision 61452) +++ gcc.c-torture/unsorted/dump-noaddr.x(revision 61453) @@ -11,10 +11,10 @@ proc dump_compare { src options } { foreach option $option_list { file delete -force dump1 file mkdir dump1 -c-torture-compile $src $option $options -dumpbase dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr +c-torture-compile $src $option $options -dumpbase [pwd]/dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr file delete -force dump2 file mkdir dump2 -c-torture-compile $src $option $options -dumpbase dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr +c-torture-compile $src $option $options -dumpbase [pwd]/dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr foreach dump1 [lsort [glob -nocomplain dump1/*]] { regsub dump1/ $dump1 dump2/ dump2 set dumptail gcc.c-torture/unsorted/[file tail $dump1] What I don't like about this approach is that dump1 and dump2 are created in the current working directory. On vxworks as I recall we did a cd to tmpdir, is that generally true? Also, if one telnets in or sshes into the host under test, the cd is mandatory... 
as otherwise one would dump turds (that's a technical term) in the home directory which would be very uncool. Maybe a better approach would be to cd to the right place if all the Canadian setups cd, as that then unifies them. With out of build-tree testing this may not (I believe) be the same as $tmpdir (where temporaries are normally created). Also the current directory may already contain directories/files called dump1 or dump2 which will get destroyed by running the The point of the cd was to get to a place where temps can be created freely... I've not committed my version yet in case I am missing something in my reasoning above with regards to the relationship between the current working directory and $tmpdir. So the question would be, does his patch work for you? It was unclear to me if the answer is no. Oh, wait, I know what I don't like about Andrew's patch, pwd, is that the directory on the target, the host or the build machine? And is that going to the host machine? They are not the same. One needs a directory on the host machine.
Re: [RFA] Enable dump-noaddr test to work in out of build tree testing
On 28/06/12 14:38, Mike Stump wrote: On Jun 28, 2012, at 1:28 AM, Matthew Gretton-Dann matthew.gretton-d...@arm.com wrote: On 27/06/12 21:35, Andrew Pinski wrote: On Wed, Jun 27, 2012 at 3:33 AM, Matthew Gretton-Dann matthew.gretton-d...@arm.com wrote: All, This patch enables the dump-noaddr test to work in out-of-build-tree testing. [snip] I created a much simpler patch which I have been meaning to submit. I attached it for reference. Thanks, Andrew Pinski ChangeLog: * testsuite/gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Use an absolute dump base instead of a relative one. Index: gcc.c-torture/unsorted/dump-noaddr.x === --- gcc.c-torture/unsorted/dump-noaddr.x(revision 61452) +++ gcc.c-torture/unsorted/dump-noaddr.x(revision 61453) @@ -11,10 +11,10 @@ proc dump_compare { src options } { foreach option $option_list { file delete -force dump1 file mkdir dump1 -c-torture-compile $src $option $options -dumpbase dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr +c-torture-compile $src $option $options -dumpbase [pwd]/dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr file delete -force dump2 file mkdir dump2 -c-torture-compile $src $option $options -dumpbase dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr +c-torture-compile $src $option $options -dumpbase [pwd]/dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr foreach dump1 [lsort [glob -nocomplain dump1/*]] { regsub dump1/ $dump1 dump2/ dump2 set dumptail gcc.c-torture/unsorted/[file tail $dump1] What I don't like about this approach is that dump1 and dump2 are created in the current working directory. On vxworks as I recall we did a cd to tmpdir, is that generally true? Also, if one telnets in or sshes into the host under test, the cd is mandatory... 
as otherwise one would dump turds (that's a technical term) in the home directory which would be very uncool. Maybe a better approach would be to cd to the right place if all the Canadian setups cd, as that then unifies them. With out of build-tree testing this may not (I believe) be the same as $tmpdir (where temporaries are normally created). Also the current directory may already contain directories/files called dump1 or dump2 which will get destroyed by running the The point of the cd was to get to a place where temps can be created freely... I've not committed my version yet in case I am missing something in my reasoning above with regards to the relationship between the current working directory and $tmpdir. So the question would be, does his patch work for you? It was unclear to me if the answer is no. Sorry - the patch works for my use case (build==host), but I was concerned over the use of [pwd] vs $tmpdir. Oh, wait, I know what I don't like about Andrew's patch, pwd, is that the directory on the target, the host or the build machine? And is that going to the host machine? They are not the same. One needs a directory on the host machine. I don't think this applies to my patch though, so are you still okay for my version to go in or is there something else I haven't considered? Thanks, Matt -- Matthew Gretton-Dann Principal Engineer, PD Software - Tools, ARM Ltd
Re: [PATCH] MIPS/libgcc: Add soft-fp support for SDE bare-iron targets
On Thu, 28 Jun 2012, Maciej W. Rozycki wrote: * config/mips/sfp-machine.h: New file. * config.host mips*-sde-elf*: Enable soft-fp. The compiler uses MIPS NaN conventions on MIPS; fp-bit knows about those but soft-fp does not. Are you not concerned about that regression? (Is this code only ever going to be used in software floating-point configurations, without exception support, so the choice of NaN doesn't matter much?) libgcc/config/mips/t-mips sets FPBIT and DPBIT. Shouldn't you do something to override those settings? Even if the libgcc logic is to build soft-fp if both soft-fp and fp-bit are configured, it would seem cleaner for the fragments to configure only the relevant one. -- Joseph S. Myers jos...@codesourcery.com
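The NaN concern is easier to follow with the two encodings side by side. The sketch below assumes the commonly documented conventions for single precision: IEEE 754-2008 recommends that a set top fraction bit mean "quiet", while legacy MIPS treats the same bit as "signaling". The `sketch_` names are invented for this illustration and are not part of soft-fp or fp-bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* IEEE 754 single layout: 1 sign bit, 8 exponent bits, 23 fraction bits.  */
#define SKETCH_EXP_MASK  0x7f800000u
#define SKETCH_FRAC_MASK 0x007fffffu
#define SKETCH_TOP_FRAC  0x00400000u  /* most significant fraction bit */

/* NaN: all exponent bits set and a non-zero fraction.  */
bool
sketch_is_nan (uint32_t bits)
{
  return (bits & SKETCH_EXP_MASK) == SKETCH_EXP_MASK
         && (bits & SKETCH_FRAC_MASK) != 0;
}

/* IEEE 754-2008 preferred encoding: top fraction bit set means quiet.  */
bool
sketch_is_quiet_2008 (uint32_t bits)
{
  return sketch_is_nan (bits) && (bits & SKETCH_TOP_FRAC) != 0;
}

/* Legacy MIPS encoding: the same bit set means signaling.  */
bool
sketch_is_quiet_legacy_mips (uint32_t bits)
{
  return sketch_is_nan (bits) && (bits & SKETCH_TOP_FRAC) == 0;
}
```

Under these assumptions 0x7fc00000 is quiet for a 2008-style reader but signaling for a legacy MIPS one, and 0x7fbfffff is the other way round, which is why a NaN-generating library has to agree with the compiler's convention.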
Re: [testsuite] don't use lto plugin if it doesn't work
On Jun 28, 2012, at 12:16 AM, Alexandre Oliva aol...@redhat.com wrote: On Jun 27, 2012, Mike Stump mikest...@comcast.net wrote: On Jun 27, 2012, at 2:07 AM, Alexandre Oliva wrote: Why? We don't demand a working plugin. Indeed, we disable the use of the plugin if we find a linker that doesn't support it. We just don't account for the possibility of finding a linker that supports plugins, but that doesn't support the one we'll build later. If this is the preferred solution, then having configure check the 64-bitness of ld and turning off the plugin altogether on mismatches sounds like a reasonable course of action to me. I'd be very surprised if I asked for an i686 native build to package and install elsewhere, and didn't get a plugin just because the build-time linker wouldn't have been able to run the plugin. The architecture of the compiler, last I knew it, was to smell out the feature set of the system, including libraries, headers, assemblers and linkers. It uses this as static configuration parameters for the build. One is not free to take the built compiler to a differently configured system at run time. Now, with that as a backdrop, how exactly do you ever plan on using the plugin? If there is no possible use for it, why then build it? So, even if there is a way to toggle the feature on, which would mean the plug-in should be built, it should still be off initially, which it isn't.
Re: [testsuite] don't use lto plugin if it doesn't work
On Jun 28, 2012, at 4:39 AM, Alexandre Oliva aol...@redhat.com wrote: On Jun 28, 2012, Jakub Jelinek ja...@redhat.com wrote: On Thu, Jun 28, 2012 at 04:16:55AM -0300, Alexandre Oliva wrote: I'd be very surprised if I asked for an i686 native build to package and install elsewhere, and didn't get a plugin just because the build-time linker wouldn't have been able to run the plugin. Not disable plugin support altogether, but disable assuming the linker supports the plugin. That still doesn't sound right to me: why should the compiler refrain from using a perfectly functional linker plugin on the machine where it's installed (not where it's built)? See your point below for one reason. The next would be because it would be a speed hit to re-check at runtime the qualities of the linker and do something different. If the system had an architecture to avoid the speed hit and people wanted to do the work to support the runtime reconfigure, that'd be fine with me. I don't think your system supports this, and I don't think you want to do that work, do you? Also, this scenario of silently deciding whether or not to use the linker plugin could bring us to different test results for the same command lines. I don't like that. Right, which is why the static configuration of the host system at build time is forever after an invariant. The linker is smelled, it doesn't support plugins, therefore we can't ever use it, therefore we never build it...
Re: [PATCH] Add MULT_HIGHPART_EXPR
On Thu, Jun 28, 2012 at 09:17:55AM +0200, Jakub Jelinek wrote: I'll look at using MULT_HIGHPART_EXPR in the pattern recognizer and vectorizing it as either of the sequences next. And here is corresponding pattern recognizer and vectorizer patch. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Unfortunately the addition of the builtin_mul_widen_* hooks on i?86 seems to pessimize the generated code for gcc.dg/vect/pr51581-3.c testcase (at least with -O3 -mavx) compared to when the hooks aren't present, because i?86 has more natural support for widen mult lo/hi compared to widen mult even/odd, but I assume that on powerpc it is the other way around. So, how should I find out, if both VEC_WIDEN_MULT_*_EXPR and builtin_mul_widen_* are possible for the particular vectype, which one will be cheaper? 2012-06-28 Jakub Jelinek ja...@redhat.com PR tree-optimization/51581 * tree-vect-stmts.c (permute_vec_elements): Add forward decl. (vectorizable_operation): Handle vectorization of MULT_HIGHPART_EXPR also using VEC_WIDEN_MULT_*_EXPR or builtin_mul_widen_* plus VEC_PERM_EXPR if vector MULT_HIGHPART_EXPR isn't supported. * tree-vect-patterns.c (vect_recog_divmod_pattern): Use MULT_HIGHPART_EXPR instead of VEC_WIDEN_MULT_*_EXPR and shifts. * gcc.dg/vect/pr51581-4.c: New test.

--- gcc/tree-vect-stmts.c.jj 2012-06-26 11:38:28.0 +0200
+++ gcc/tree-vect-stmts.c 2012-06-28 13:27:50.475158271 +0200
@@ -3288,6 +3288,10 @@ vectorizable_shift (gimple stmt, gimple_
 }
 
 
+static tree permute_vec_elements (tree, tree, tree, gimple,
+                                  gimple_stmt_iterator *);
+
+
 /* Function vectorizable_operation.
 
    Check if STMT performs a binary, unary or ternary operation that can
@@ -3300,17 +3304,18 @@ static bool
 vectorizable_operation (gimple stmt, gimple_stmt_iterator *gsi,
                         gimple *vec_stmt, slp_tree slp_node)
 {
-  tree vec_dest;
+  tree vec_dest, vec_dest2 = NULL_TREE;
+  tree vec_dest3 = NULL_TREE, vec_dest4 = NULL_TREE;
   tree scalar_dest;
   tree op0, op1 = NULL_TREE, op2 = NULL_TREE;
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
-  tree vectype;
+  tree vectype, wide_vectype = NULL_TREE;
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
   enum tree_code code;
   enum machine_mode vec_mode;
   tree new_temp;
   int op_type;
-  optab optab;
+  optab optab, optab2 = NULL;
   int icode;
   tree def;
   gimple def_stmt;
@@ -3327,6 +3332,8 @@ vectorizable_operation (gimple stmt, gim
   tree vop0, vop1, vop2;
   bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
   int vf;
+  unsigned char *sel = NULL;
+  tree decl1 = NULL_TREE, decl2 = NULL_TREE, perm_mask = NULL_TREE;
 
   if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
     return false;
@@ -3451,31 +3458,97 @@ vectorizable_operation (gimple stmt, gim
   optab = optab_for_tree_code (code, vectype, optab_default);
 
   /* Supportable by target?  */
-  if (!optab)
+  if (!optab && code != MULT_HIGHPART_EXPR)
     {
       if (vect_print_dump_info (REPORT_DETAILS))
         fprintf (vect_dump, "no optab.");
       return false;
     }
   vec_mode = TYPE_MODE (vectype);
-  icode = (int) optab_handler (optab, vec_mode);
+  icode = optab ? (int) optab_handler (optab, vec_mode) : CODE_FOR_nothing;
+
+  if (icode == CODE_FOR_nothing
+      && code == MULT_HIGHPART_EXPR
+      && VECTOR_MODE_P (vec_mode)
+      && BYTES_BIG_ENDIAN == WORDS_BIG_ENDIAN)
+    {
+      /* If MULT_HIGHPART_EXPR isn't supported by the backend, see
+         if we can emit VEC_WIDEN_MULT_{LO,HI}_EXPR followed by VEC_PERM_EXPR
+         or builtin_mul_widen_{even,odd} followed by VEC_PERM_EXPR.  */
+      unsigned int prec = TYPE_PRECISION (TREE_TYPE (scalar_dest));
+      unsigned int unsignedp = TYPE_UNSIGNED (TREE_TYPE (scalar_dest));
+      tree wide_type
+        = build_nonstandard_integer_type (prec * 2, unsignedp);
+      wide_vectype
+        = get_same_sized_vectype (wide_type, vectype);
+
+      sel = XALLOCAVEC (unsigned char, nunits_in);
+      if (VECTOR_MODE_P (TYPE_MODE (wide_vectype))
+          && GET_MODE_SIZE (TYPE_MODE (wide_vectype))
+             == GET_MODE_SIZE (vec_mode))
+        {
+          if (targetm.vectorize.builtin_mul_widen_even
+              && (decl1 = targetm.vectorize.builtin_mul_widen_even (vectype))
+              && targetm.vectorize.builtin_mul_widen_odd
+              && (decl2 = targetm.vectorize.builtin_mul_widen_odd (vectype))
+              && TYPE_MODE (TREE_TYPE (TREE_TYPE (decl1)))
+                 == TYPE_MODE (wide_vectype))
+            {
+              for (i = 0; i < nunits_in; i++)
+                sel[i] = !BYTES_BIG_ENDIAN + (i & ~1)
+                         + ((i & 1) ? nunits_in : 0);
+              if (0 && can_vec_perm_p (vec_mode, false, sel))
+                icode = 0;
+            }
+          if (icode == CODE_FOR_nothing)
+            {
+              decl1 = NULL_TREE;
+              decl2 =
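As background for why the divmod pattern wants a high-part multiply at all, here is a minimal scalar sketch. It assumes unsigned 16-bit operands and the well-known magic constant for dividing by 3; the `sketch_` function names are invented here and are not GCC internals.

```c
#include <stdint.h>

/* Scalar model of MULT_HIGHPART_EXPR on 16-bit unsigned operands:
   the upper half of the full 32-bit product.  */
uint16_t
sketch_mulhi_u16 (uint16_t a, uint16_t b)
{
  return (uint16_t) (((uint32_t) a * (uint32_t) b) >> 16);
}

/* x / 3 without a divide instruction.  0xAAAB is ceil (2^17 / 3), so
   the quotient is the high part of the product shifted right by one
   further bit.  */
uint16_t
sketch_div3_u16 (uint16_t x)
{
  return (uint16_t) (sketch_mulhi_u16 (x, 0xAAABu) >> 1);
}
```

A vector version of the same idea needs only a lane-wise high-part multiply and a shift, which is what makes a dedicated tree code attractive compared with widening, multiplying and permuting.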
Re: [wwwdocs] Update coding conventions for C++
On Wed, 27 Jun 2012, Lawrence Crowl wrote:
+<h4><a name="Namespace_Use">Namespaces</a></h4>
+
+<p>
+Namespaces are encouraged.
+All separable libraries should have a unique global namespace.
+All individual tools should have a unique global namespace.
+Nested include directories names should map to nested namespaces when possible.
+</p>
Do all people have a consensus on the use of namespace? Well, we really only know about objections, and I have not seen any. I certainly think namespaces are a useful feature to use in GCC (with a namespace for the gcc/ directory, or as you imply separate ones for the driver and the compilers proper, one for libcpp, one for each front end, etc.). -- Joseph S. Myers jos...@codesourcery.com
Re: [testsuite] don't use lto plugin if it doesn't work
On Thu, Jun 28, 2012 at 07:03:37AM -0700, Mike Stump wrote: Also, this scenario of silently deciding whether or not to use the linker plugin could bring us to different test results for the same command lines. I don't like that. Right, which is why the static configuration of the host system at build time is forever after an invariant. The linker is smelled, it doesn't support plugins, therefore we can't ever use it, therefore we never build it... This test is not about whether to build the plugin, but whether to force using it by default. And to be able to use it by default, you need a guarantee that all the linkers you'll use it with do support the plugin. Therefore, if the build-time linker doesn't support it, I think it is just fine to assume not all of your linkers support the plugin and not enable it by default. Jakub
Re: [testsuite] don't use lto plugin if it doesn't work
On Thu, Jun 28, 2012 at 4:08 PM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Jun 28, 2012 at 07:03:37AM -0700, Mike Stump wrote: Also, this scenario of silently deciding whether or not to use the linker plugin could bring us to different test results for the same command lines. I don't like that. Right, which is why the static configuration of the host system at build time is forever after an invariant. The linker is smelled, it doesn't support plugins, therefore we can't ever use it, therefore we never build it... This test is not about whether to build the plugin, but whether to force using it by default. And to be able to use it by default, you need a guarantee that all the linkers you'll use it with do support the plugin. Therefore, if the build-time linker doesn't support it, I think it is just fine to assume not all of your linkers support the plugin and not enable it by default. I'd like to have a more reliable way to enable/disable the default use of the linker-plugin then. Something in config.gcc maybe, or at least a flag I can specify at configure time. If the default in config.gcc is detected to not work then explicitly changing that (or confirming it) would be required - otherwise we'd error out. Richard.
Re: [Ada] Attribute 'Old should only be used in postconditions
Probably suppress both, since they no longer make sense (they are testing an early implementation of 'Old, before 'Old was standardized in Ada 2012). I'll take care of it. Thanks! -- Eric Botcazou
Re: [Ada] Attribute 'Old should only be used in postconditions
Probably suppress both, since they no longer make sense (they are testing an early implementation of 'Old, before 'Old was standardized in Ada 2012). I'll take care of it. Thanks! Sure, done for the record (revision 189042).
Re: [PATCH] Add MULT_HIGHPART_EXPR
On 2012-06-28 07:05, Jakub Jelinek wrote: Unfortunately the addition of the builtin_mul_widen_* hooks on i?86 seems to pessimize the generated code for gcc.dg/vect/pr51581-3.c testcase (at least with -O3 -mavx) compared to when the hooks aren't present, because i?86 has more natural support for widen mult lo/hi compared to widen mult even/odd, but I assume that on powerpc it is the other way around. So, how should I find out if both VEC_WIDEN_MULT_*_EXPR and builtin_mul_widen_* are possible for the particular vectype which one will be cheaper? I would assume that if the builtin exists, then it is cheaper. I disagree about "x86 has more natural support for hi/lo". The basic sse2 multiplication is even. One shift per input is needed to generate odd. On the other hand, one interleave per input is required for both hi/lo. So 4 setup insns for hi/lo, and 2 setup insns for even/odd. And on top of all that, XOP includes multiply odd at least for signed V4SI. I'll have a look at the test case you mention while I re-look at the patches... r~
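To make the even/odd route concrete, the following sketch models four 32-bit lanes in plain C. It assumes little-endian lane order, and it assumes the selector from the posted patch reads `!BYTES_BIG_ENDIAN + (i & ~1) + ((i & 1) ? nunits : 0)` (the archive dropped the operators, so that reading is a reconstruction). All `sketch_` names are invented for the example.

```c
#include <stdint.h>

#define SKETCH_NUNITS 4

static uint32_t
sketch_hi32 (uint64_t v)
{
  return (uint32_t) (v >> 32);
}

/* High parts of four 32x32->64 products, built the even/odd way:
   widen-multiply the even lanes and the odd lanes separately, view
   both results as eight 32-bit elements, then pick the high halves.  */
void
sketch_mulhi_even_odd (const uint32_t a[SKETCH_NUNITS],
                       const uint32_t b[SKETCH_NUNITS],
                       uint32_t out[SKETCH_NUNITS])
{
  uint32_t cat[2 * SKETCH_NUNITS];
  unsigned i;

  /* Elements 0..3 hold the even products, 4..7 the odd products,
     each stored low half first (little-endian lane order).  */
  for (i = 0; i < SKETCH_NUNITS / 2; i++)
    {
      uint64_t even = (uint64_t) a[2 * i] * b[2 * i];
      uint64_t odd = (uint64_t) a[2 * i + 1] * b[2 * i + 1];

      cat[2 * i] = (uint32_t) even;
      cat[2 * i + 1] = sketch_hi32 (even);
      cat[SKETCH_NUNITS + 2 * i] = (uint32_t) odd;
      cat[SKETCH_NUNITS + 2 * i + 1] = sketch_hi32 (odd);
    }

  /* Selector for the little-endian case works out to 1, 5, 3, 7.  */
  for (i = 0; i < SKETCH_NUNITS; i++)
    {
      unsigned sel = 1 + (i & ~1u) + ((i & 1) ? SKETCH_NUNITS : 0);
      out[i] = cat[sel];
    }
}
```

The multiplies themselves are cheap in this scheme; the cost question raised in the thread is about how expensive the final two-input permutation is on a given target.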
Re: [PATCH, GCC][AArch64] Use Enums for code models option selection
Tejas Belagod wrote: Marcus Shawcroft wrote: On 13/06/12 14:38, Sofiane Naci wrote: Hi, I discovered a bug in my previous patch, so I attach a new one. The ChangeLog hasn't changed. OK to commit? Thanks Sofiane -Original Message- From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Sofiane Naci Sent: 31 May 2012 10:55 To: gcc-patches@gcc.gnu.org Subject: [PATCH, GCC][AArch64] Use Enums for code models option selection Hi, This patch re-factors code models option selection in the AArch64 port: . Renaming variables such as mem_model to cmodel, for better clarity. . Using the generic support for enumerated option arguments. . Fixing touched code layout and formatting issues. Thanks Sofiane - ChangeLog: 2012-05-31 Sofiane Nacisofiane.n...@arm.com [AArch64] Use Enums for code models option selection. * config/aarch64/aarch64-elf-raw.h (AARCH64_DEFAULT_MEM_MODEL): Delete. * config/aarch64/aarch64-linux.h (AARCH64_DEFAULT_MEM_MODEL): Delete. * config/aarch64/aarch64-opts.h (enum aarch64_code_model): New. * config/aarch64/aarch64-protos.h: Update comments. * config/aarch64/aarch64.c: Update comments. (aarch64_default_mem_model): Rename to aarch64_code_model. (aarch64_expand_mov_immediate): Remove error message. (aarch64_select_rtx_section): Remove assertion and update comment. (aarch64_override_options): Move memory model initialization from here. (struct aarch64_mem_model): Delete. (aarch64_memory_models[]): Delete. (initialize_aarch64_memory_model): Rename to initialize_aarch64_code_model and update. (aarch64_classify_symbol): Handle AARCH64_CMODEL_TINY and AARCH64_CMODEL_TINY_PIC * config/aarch64/aarch64.h (enum aarch64_memory_model): Delete. (aarch64_default_mem_model): Rename to aarch64_cmodel. (HAS_LONG_COND_BRANCH): Update. (HAS_LONG_UNCOND_BRANCH): Update. * config/aarch64/aarch64.opt (cmodel): New. (mcmodel): Update. OK I've checked this in on aarch64-branch upstream for Sofiane. Tejas. 
Sorry, I broke the build when I applied this patch. Attached is a patch that fixes this. Build and regressions are happy. OK to commit? Thanks, Tejas Belagod. ARM. Changelog 2012-06-28 Tejas Belagod tejas.bela...@arm.com gcc/ * config/aarch64/aarch64.h (aarch64_cmodel): Fix enum name.

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index ce2f899..5e24cd7 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -802,7 +802,7 @@ enum aarch64_builtins
 /* Check TLS Descriptors mechanism is selected.  */
 #define TARGET_TLS_DESC (aarch64_tls_dialect == TLS_DESCRIPTORS)
 
-extern enum aarch64_memory_model aarch64_cmodel;
+extern enum aarch64_code_model aarch64_cmodel;
 
 /* When using the tiny addressing model conditional and unconditional branches
    can span the whole of the available address space (1MB).  */
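The fix is a one-word rename, but the underlying pattern (one enum shared by the declaration, the definition and the option parser) can be sketched as follows. The names are hypothetical mirrors, not the port's real identifiers: only the TINY and TINY_PIC enumerators are mentioned in the thread, and the remaining enumerators and the accepted strings are assumptions made for illustration.

```c
#include <string.h>

/* Hypothetical mirror of the enum named in the ChangeLog.  */
enum sketch_code_model
{
  SKETCH_CMODEL_TINY,
  SKETCH_CMODEL_TINY_PIC,
  SKETCH_CMODEL_SMALL,    /* assumed */
  SKETCH_CMODEL_LARGE,    /* assumed */
  SKETCH_CMODEL_INVALID   /* sentinel used only by this sketch */
};

/* The declaration and the definition must name the same enum type.
   The build break came from a declaration that still used the old
   tag after the type had been renamed.  */
extern enum sketch_code_model sketch_cmodel;
enum sketch_code_model sketch_cmodel = SKETCH_CMODEL_SMALL;

/* Map an option argument to an enumerator, roughly the job the
   generic enumerated-option support takes over from hand-written
   tables.  The accepted strings are assumptions.  */
enum sketch_code_model
sketch_parse_cmodel (const char *arg)
{
  if (strcmp (arg, "tiny") == 0)
    return SKETCH_CMODEL_TINY;
  if (strcmp (arg, "small") == 0)
    return SKETCH_CMODEL_SMALL;
  if (strcmp (arg, "large") == 0)
    return SKETCH_CMODEL_LARGE;
  return SKETCH_CMODEL_INVALID;
}
```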
[lra] trunk merged into the branch
I merged trunk at 188913 into lra branch. Some changes were required to make lra branch bootstrapped on x86/x86-64 and ppc. 2012-06-23 Vladimir Makarov vmaka...@redhat.com * lra.c (check_rtl): Add arg to insn_invalid_p call. * lra-assigns.c (init_regno_assign_info): Use ira_class_hard_regs_num instead of ira_available_class_regs. (reload_pseudo_compare_func): Ditto. * lra-constraints.c (extract_loc_address_regs): Set up disp_loc first. Transfer true for context_p only when base_reg_loc is defined. Add processing UNSPEC. (process_addr_reg): Reload always for non-reg. (equiv_address_substitution): Add arg to plus_constant calls. (curr_insn_transform): Don't process addresses for operators. Change duplication updates. (inherit_reload_reg): Use ira_class_hard_regs_num instead of ira_available_class_regs. * lra-eliminations.c (for_sum, lra_eliminate_regs_1): Add arg to plus_constant calls. (eliminate_regs_in_insn): Ditto. 2012-06-25 Vladimir Makarov vmaka...@redhat.com * output.h (alter_subreg): Add new argument. * sdbout.c (sdbout_symbol): Pass new argument to alter_subreg. * dbxout.c (dbxout_symbol_location): Ditto. * final.c (final_scan_insn, cleanup_subreg_operands): Ditto. (walk_alter_subreg, output_operand): Ditto. (alter_subreg): Add new argument. * emit-rtl.c (gen_rtx_REG): Add lra_in_progress. * config/rs6000/rs6000.c (rs6000_legitimate_offset_address_p): Always pass true to legitimate_constant_pool_address_p when lra_in_progress. (rs6000_legitimate_address_p): Ditto. * lra-int.h (lra_update_operator_dups): New. * lra.c (lra): Put lra_in_progress after lra_hard_reg_substitution. * lra-spills.c (lra_hard_reg_substitution): Pass new argument to alter_subreg. Call lra_update_operator_dups. * lra-eliminations.c (lra_eliminate_regs_1): Pass new argument to alter_subreg. * lra-constraints.c (simplify_operand_subreg): Ditto. (curr_insn_transform): Use lra_update_operator_dups.
Re: [PATCH] Add MULT_HIGHPART_EXPR
On Thu, Jun 28, 2012 at 08:57:23AM -0700, Richard Henderson wrote: On 2012-06-28 07:05, Jakub Jelinek wrote: Unfortunately the addition of the builtin_mul_widen_* hooks on i?86 seems to pessimize the generated code for gcc.dg/vect/pr51581-3.c testcase (at least with -O3 -mavx) compared to when the hooks aren't present, because i?86 has more natural support for widen mult lo/hi compared to widen mult even/odd, but I assume that on powerpc it is the other way around. So, how should I find out if both VEC_WIDEN_MULT_*_EXPR and builtin_mul_widen_* are possible for the particular vectype which one will be cheaper? I would assume that if the builtin exists, then it is cheaper. I disagree about "x86 has more natural support for hi/lo". The basic sse2 multiplication is even. One shift per input is needed to generate odd. On the other hand, one interleave per input is required for both hi/lo. So 4 setup insns for hi/lo, and 2 setup insns for even/odd. And on top of all that, XOP includes multiply odd at least for signed V4SI. Perhaps the problem is then that the permutation is much more expensive for even/odd.
With even/odd the f2 routine is:
        vmovdqa    d(%rip), %xmm2
        vmovdqa    .LC1(%rip), %xmm0
        vpsrlq     $32, %xmm2, %xmm4
        vmovdqa    d+16(%rip), %xmm1
        vpmuludq   %xmm0, %xmm2, %xmm5
        vpsrlq     $32, %xmm0, %xmm3
        vpmuludq   %xmm3, %xmm4, %xmm4
        vpmuludq   %xmm0, %xmm1, %xmm0
        vmovdqa    .LC2(%rip), %xmm2
        vpsrlq     $32, %xmm1, %xmm1
        vpmuludq   %xmm3, %xmm1, %xmm3
        vmovdqa    .LC3(%rip), %xmm1
        vpshufb    %xmm2, %xmm5, %xmm5
        vpshufb    %xmm1, %xmm4, %xmm4
        vpshufb    %xmm2, %xmm0, %xmm2
        vpshufb    %xmm1, %xmm3, %xmm1
        vpor       %xmm4, %xmm5, %xmm4
        vpor       %xmm1, %xmm2, %xmm1
        vpsrld     $1, %xmm4, %xmm4
        vmovdqa    %xmm4, c(%rip)
        vpsrld     $1, %xmm1, %xmm1
        vmovdqa    %xmm1, c+16(%rip)
        ret
and with lo/hi it is:
        vmovdqa    d(%rip), %xmm2
        vpunpckhdq %xmm2, %xmm2, %xmm3
        vpunpckldq %xmm2, %xmm2, %xmm2
        vmovdqa    .LC1(%rip), %xmm0
        vpmuludq   %xmm0, %xmm3, %xmm3
        vmovdqa    d+16(%rip), %xmm1
        vpmuludq   %xmm0, %xmm2, %xmm2
        vshufps    $221, %xmm2, %xmm3, %xmm2
        vpsrld     $1, %xmm2, %xmm2
        vmovdqa    %xmm2, c(%rip)
        vpunpckhdq %xmm1, %xmm1, %xmm2
        vpunpckldq %xmm1, %xmm1, %xmm1
        vpmuludq   %xmm0, %xmm2, %xmm2
        vpmuludq   %xmm0, %xmm1, %xmm0
        vshufps    $221, %xmm0, %xmm2, %xmm0
        vpsrld     $1, %xmm0, %xmm0
        vmovdqa    %xmm0, c+16(%rip)
        ret
Jakub
Re: [PATCH] Add MULT_HIGHPART_EXPR
On Thu, Jun 28, 2012 at 8:57 AM, Richard Henderson r...@redhat.com wrote: On 2012-06-28 07:05, Jakub Jelinek wrote: Unfortunately the addition of the builtin_mul_widen_* hooks on i?86 seems to pessimize the generated code for gcc.dg/vect/pr51581-3.c testcase (at least with -O3 -mavx) compared to when the hooks aren't present, because i?86 has more natural support for widen mult lo/hi compared to widen mult even/odd, but I assume that on powerpc it is the other way around. So, how should I find out if both VEC_WIDEN_MULT_*_EXPR and builtin_mul_widen_* are possible for the particular vectype which one will be cheaper? I would assume that if the builtin exists, then it is cheaper. I disagree about "x86 has more natural support for hi/lo". The basic sse2 multiplication is even. One shift per input is needed to generate odd. On the other hand, one interleave per input is required for both hi/lo. So 4 setup insns for hi/lo, and 2 setup insns for even/odd. And on top of all that, XOP includes multiply odd at least for signed V4SI. I'll have a look at the test case you mention while I re-look at the patches... The upper 128-bit of 256-bit AVX instructions aren't a good fit with the current vectorizer infrastructure. -- H.J.
Re: [testsuite] gcc.dg/vect/vect-50.c: combine two scans
On 06/27/2012 05:05 PM, Mike Stump wrote: On Jun 27, 2012, at 3:36 PM, Janis Johnson wrote: These scans from gcc.dg/vect/vect-50.c, and others similar to them in other vect tests, hurt my brain: /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align } } } } */ /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { target vect_hw_misalign } } } */ Both of these PASS for i686-pc-linux-gnu, causing duplicate lines in the gcc test summary. I'm pretty sure the following accomplishes the same goal: /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 2 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */ I don't think so? The first sets the xfail status for the testcase. If you change the condition, you change the xfail state for some targets, which would be wrong (without a vec person chiming in). The two checks are run separately. The first one runs everywhere and is expected to fail for vect_no_align. The second is only run for vect_hw_misalign. Targets for which vect_no_align is false and vect_hw_misalign is true get two PASS reports. I'd like to think you can compose the two with some spelling... I just don't think this one is it.? No, there is no way to combine target and xfail, although since we intercept them we could presumably come up with a way to do that, with syntax and semantics we design. I grepped around and found: /* { dg-message does break strict-aliasing { target { *-*-* lp64 } xfail *-*-* } 8 } */ which might have the right way to spell it, though, I always test to ensure the construct does what I want. Nope. That should be flagged as an error by dg-message but it's passed through GCC's process-message which ignores errors (a bug) and simply ignores the directive. I'm currently trying a fix to not ignore errors from dg-error/dg-warning/dg-message and will then fix up the broken tests.
That is, run the check everywhere We don't want to run the test on other than vect_hw_misalign targets, right? I don't know, but right now it's run everywhere at least once. Janis
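The difference between the two original directives and the proposed single directive can be tabulated with a small model. This is only a sketch of the semantics as described in the message (first scan runs everywhere and is xfailed on vect_no_align; second scan runs only on vect_hw_misalign); the function names are invented and nothing here is DejaGnu code.

```c
#include <stdbool.h>

/* Result lines produced by the two original directives for one target.  */
static int reports_two_directives(bool vect_no_align, bool vect_hw_misalign)
{
    (void)vect_no_align; /* affects PASS vs. XFAIL, not how many lines appear */
    return 1 + (vect_hw_misalign ? 1 : 0);
}

/* Result lines produced by the proposed single directive.  */
static int reports_combined(bool vect_no_align, bool vect_hw_misalign)
{
    (void)vect_no_align;
    (void)vect_hw_misalign;
    return 1; /* it runs everywhere, exactly once */
}

/* Is the first original scan expected to fail on this target?  */
static bool first_scan_expects_failure(bool vect_no_align)
{
    return vect_no_align;
}

/* Is the proposed combined scan expected to fail on this target?  */
static bool combined_expects_failure(bool vect_no_align, bool vect_hw_misalign)
{
    return vect_no_align && !vect_hw_misalign;
}
```

In this model the two spellings disagree only for a target where both vect_no_align and vect_hw_misalign hold: the original first scan is xfailed there, while the combined scan is expected to pass.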
Re: [PATCH, GCC][AArch64] Use Enums for code models option selection
On 28/06/12 16:58, Tejas Belagod wrote: Sorry, I broke the build when I applied this patch. Attached is a patch that fixes this. Build and regressions are happy. OK to commit? Thanks, Tejas Belagod. ARM. Changelog 2012-06-28 Tejas Belagod tejas.bela...@arm.com gcc/ * config/aarch64/aarch64.h (aarch64_cmodel): Fix enum name. OK. R.
Re: [PATCH] Add MULT_HIGHPART_EXPR
On 2012-06-28 09:20, Jakub Jelinek wrote:
> Perhaps the problem is then that the permutation is much more expensive for even/odd. With even/odd the f2 routine is:
> ...
>     vpshufb %xmm2, %xmm5, %xmm5
>     vpshufb %xmm1, %xmm4, %xmm4
>     vpor    %xmm4, %xmm5, %xmm4
> ...
> and with lo/hi it is:
>     vshufps $221, %xmm2, %xmm3, %xmm2

Hmm. That second has a reformatting delay. Last week when I pulled the mulv4si3 routine out to i386.c, I experimented with a few different options, including that interleave+shufps sequence seen here for lo/hi. See the comment there discussing options and timing.

This also shows a deficiency in our vec_perm logic:

    0L 0H 2L 2H    1L 1H 3L 3H
    0H 2H 0H 2H    1H 3H 1H 3H    2*pshufd
    0H 1H 2H 3H                   punpckldq

without the permutation constants in memory.

r~
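The two-pshufd-plus-punpckldq sequence in the diagram can be checked with a scalar model. The sketch below emulates the two instructions on arrays of four 32-bit lanes; the helper names are hypothetical and the code is plain C rather than SSE intrinsics.

```c
#include <stdint.h>

/* Model of pshufd: out[i] = src[sel[i]].  OUT must not alias SRC.  */
static void pshufd_model(const uint32_t src[4], const int sel[4],
                         uint32_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = src[sel[i]];
}

/* Model of punpckldq: interleave the two low lanes of A and B.  */
static void punpckldq_model(const uint32_t a[4], const uint32_t b[4],
                            uint32_t out[4])
{
    out[0] = a[0];
    out[1] = b[0];
    out[2] = a[1];
    out[3] = b[1];
}

/* EVEN holds {0L, 0H, 2L, 2H} and ODD holds {1L, 1H, 3L, 3H}.
   The result is {0H, 1H, 2H, 3H}, the four high parts in order.  */
static void highparts_from_even_odd(const uint32_t even[4],
                                    const uint32_t odd[4], uint32_t out[4])
{
    static const int sel[4] = {1, 3, 1, 3};
    uint32_t e[4], o[4];

    pshufd_model(even, sel, e);  /* {0H, 2H, 0H, 2H} */
    pshufd_model(odd, sel, o);   /* {1H, 3H, 1H, 3H} */
    punpckldq_model(e, o, out);  /* {0H, 1H, 2H, 3H} */
}
```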
Re: [PATCH] Add MULT_HIGHPART_EXPR
On 2012-06-28 07:05, Jakub Jelinek wrote:
>     PR tree-optimization/51581
>     * tree-vect-stmts.c (permute_vec_elements): Add forward decl.
>     (vectorizable_operation): Handle vectorization of MULT_HIGHPART_EXPR also using VEC_WIDEN_MULT_*_EXPR or builtin_mul_widen_* plus VEC_PERM_EXPR if vector MULT_HIGHPART_EXPR isn't supported.
>     * tree-vect-patterns.c (vect_recog_divmod_pattern): Use MULT_HIGHPART_EXPR instead of VEC_WIDEN_MULT_*_EXPR and shifts.
>
>     * gcc.dg/vect/pr51581-4.c: New test.

Ok, except,

> +      if (0 && can_vec_perm_p (vec_mode, false, sel))
> +        icode = 0;

Testing hack left in.

r~
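As background for the divmod pattern mentioned in the ChangeLog, a high-part multiply is the upper half of a widening product, and division by a constant can be rewritten in terms of it. The sketch below shows the well-known unsigned 16-bit divide-by-3 case; the function names and the choice of divisor are illustrative assumptions, not taken from the patch.

```c
#include <stdint.h>

/* High 16 bits of the 32-bit product: a scalar stand-in for what
   MULT_HIGHPART_EXPR computes per element.  */
static uint16_t mult_highpart_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(((uint32_t)a * b) >> 16);
}

/* x / 3 for any 16-bit x: 0xAAAB is ceil(2^17 / 3), so the high part
   of x * 0xAAAB, shifted right once more, equals floor(x / 3).  */
static uint16_t div3_u16(uint16_t x)
{
    return (uint16_t)(mult_highpart_u16(x, 0xAAAB) >> 1);
}
```

The vector form applies the same multiply and shift to every lane, which is why a native vector high-part multiply avoids the widen-then-permute sequences discussed in this thread.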
[C++ Pubnames Patch] Anonymous namespaces enclosed in named namespaces. (issue6343052)
The enclosed patch fixes the pubnames generated for anonymous namespaces contained within named namespaces, and adds an extensive test for the various pubnames.

The bug is that when printing at verbosity level 1, and lang_decl_name sees a namespace decl that is not in the global namespace, it prints the namespace's enclosing scopes--so far so good. However, the code I added earlier this month to handle anonymous namespaces also prints the enclosing scopes, so one would get foo::foo::(anonymous namespace) instead of foo::(anonymous namespace).

The solution is to stop the added code from printing the enclosing scope, which is correct for both verbosity levels 0 and 1. Level 2 is handled elsewhere and so not relevant.

I have formalized the tests I have been using to be sure pubnames are correct and include that in this patch. It is based on ccoutant's gdb_index_test.cc from the gold test suite.

OK for mainline?

Sterling

gcc/cp/ChangeLog

2012-06-28 Sterling Augustine saugust...@google.com

    * error.c (lang_decl_name): Use TFF_UNQUALIFIED_NAME flag.

gcc/testsuite/ChangeLog

2012-06-28 Sterling Augustine saugust...@google.com

    * g++.dg/debug/dwarf2/pubnames-2.C: New.
Index: cp/error.c
===
--- cp/error.c (revision 189025)
+++ cp/error.c (working copy)
@@ -2633,7 +2633,7 @@
     dump_function_name (decl, TFF_PLAIN_IDENTIFIER);
   else if ((DECL_NAME (decl) == NULL_TREE)
            && TREE_CODE (decl) == NAMESPACE_DECL)
-    dump_decl (decl, TFF_PLAIN_IDENTIFIER);
+    dump_decl (decl, TFF_PLAIN_IDENTIFIER | TFF_UNQUALIFIED_NAME);
   else
     dump_decl (DECL_NAME (decl), TFF_PLAIN_IDENTIFIER);

Index: testsuite/g++.dg/debug/dwarf2/pubnames-2.C
===
--- testsuite/g++.dg/debug/dwarf2/pubnames-2.C (revision 0)
+++ testsuite/g++.dg/debug/dwarf2/pubnames-2.C (revision 0)
@@ -0,0 +1,194 @@
+// { dg-do compile }
+// { dg-options -gpubnames -gdwarf-4 -std=c++0x -dA }
+// { dg-final { scan-assembler .section\t.debug_pubnames } }
+// { dg-final { scan-assembler \\\(anonymous namespace\\)0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \one0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \one::G_A0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \one::G_B0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \one::G_C0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \one::\\(anonymous namespace\\)0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \two0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \F_A0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \F_B0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \F_C0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \inline_func_10\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \one::c1::c10\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \one::c1::~c10\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \one::c1::val0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \check_enum0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \main0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \two::c2int::c20\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \two::c2double::c20\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \two::c2int const\\\*::c20\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \checkone::c10\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \checktwo::c2int \\0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \checktwo::c2double \\0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \checktwo::c2int const\\\* \\0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \two::c2int::val0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \two::c2double::val0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \two::c2int const\\\*::val0\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \__static_initialization_and_destruction_00\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \two::c2int::~c20\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \two::c2double::~c20\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final { scan-assembler \two::c2int const\\\*::~c20\+\[ \t\]+\[#;]+\[ \t\]+external name } }
+// { dg-final {
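The double-qualification bug described in this message can be reproduced with a toy printer. The sketch below is a hypothetical model in C, not the code in cp/error.c: the UNQUALIFIED flag merely stands in for TFF_UNQUALIFIED_NAME, and the struct and function names are invented.

```c
#include <stddef.h>
#include <string.h>

struct scope {
    const char *name;           /* NULL for an anonymous namespace */
    const struct scope *parent; /* NULL when directly in the global scope */
};

enum { UNQUALIFIED = 1 }; /* stand-in for TFF_UNQUALIFIED_NAME */

/* Append TEXT to BUF without overflowing SIZE bytes.  */
static void append(char *buf, size_t size, const char *text)
{
    strncat(buf, text, size - strlen(buf) - 1);
}

static void append_scope_name(char *buf, size_t size, const struct scope *s)
{
    append(buf, size, s->name ? s->name : "(anonymous namespace)");
}

/* Append the scopes enclosing S, outermost first, each followed by "::".  */
static void append_enclosing(char *buf, size_t size, const struct scope *s)
{
    if (s->parent == NULL)
        return;
    append_enclosing(buf, size, s->parent);
    append_scope_name(buf, size, s->parent);
    append(buf, size, "::");
}

/* Inner dumper: qualifies the name itself unless told not to.  */
static void dump_scope(char *buf, size_t size, const struct scope *s, int flags)
{
    if (!(flags & UNQUALIFIED))
        append_enclosing(buf, size, s);
    append_scope_name(buf, size, s);
}

/* Outer caller: it has already printed the enclosing scopes, so the
   inner dumper must be asked for the unqualified name.  */
static void decl_name(char *buf, size_t size, const struct scope *s,
                      int inner_flags)
{
    buf[0] = '\0';
    append_enclosing(buf, size, s);
    dump_scope(buf, size, s, inner_flags);
}
```

With an anonymous namespace nested in foo, passing 0 as inner_flags yields foo::foo::(anonymous namespace), and passing UNQUALIFIED yields foo::(anonymous namespace), mirroring the before and after behaviour the message describes.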
[lra] a patch to fix last testsuite regression on x86/x86-64
The following patch fixes the last GCC testsuite regression (in comparison with reload) on x86/x86-64 after the last merge of trunk into lra.

The patch implements Bernd's recent optimization (restoring an argument pseudo value from the call result) in LRA.

The patch was successfully bootstrapped on x86/x86-64.

Committed as rev. 189051.

2012-06-28 Vladimir Makarov vmaka...@redhat.com

    * lra-constraints.c (inherit_in_ebb): Implement restoring argument pseudo value from the call result.

Index: lra-constraints.c
===
--- lra-constraints.c (revision 189016)
+++ lra-constraints.c (working copy)
@@ -4344,7 +4344,7 @@ inherit_in_ebb (rtx head, rtx tail)
 {
   int i, src_regno, dst_regno;
   bool change_p, succ_p;
-  rtx prev_insn, next_usage_insns, set, first_insn, last_insn, next_insn;
+  rtx prev_insn, next_usage_insns, set, first_insn, last_insn, next_insn;
   enum reg_class cl;
   struct lra_insn_reg *reg;
   basic_block last_processed_bb, curr_bb = NULL;
@@ -4354,7 +4354,6 @@ inherit_in_ebb (rtx head, rtx tail)
   bitmap_iterator bi;
   bool head_p, after_p;
 
-
   change_p = false;
   curr_usage_insns_check++;
   reloads_num = calls_num = 0;
@@ -4536,7 +4535,41 @@ inherit_in_ebb (rtx head, rtx tail)
                               to_inherit[i].insns))
           change_p = true;
       if (CALL_P (curr_insn))
-        calls_num++;
+        {
+          rtx cheap, pat, dest, restore;
+          int regno, hard_regno;
+
+          calls_num++;
+          if ((cheap = find_reg_note (curr_insn,
+                                      REG_RETURNED, NULL_RTX)) != NULL_RTX
+              && ((cheap = XEXP (cheap, 0)), true)
+              && (regno = REGNO (cheap)) >= FIRST_PSEUDO_REGISTER
+              && (hard_regno = reg_renumber[regno]) >= 0
+              /* If there are pending saves/restores, the
+                 optimization is not worth.  */
+              && usage_insns[regno].calls_num == calls_num - 1
+              && TEST_HARD_REG_BIT (call_used_reg_set, hard_regno))
+            {
+              /* Restore the pseudo from the call result as
+                 REG_RETURNED note says that the pseudo value is
+                 in the call result and the pseudo is an argument
+                 of the call.  */
+              pat = PATTERN (curr_insn);
+              if (GET_CODE (pat) == PARALLEL)
+                pat = XVECEXP (pat, 0, 0);
+              dest = SET_DEST (pat);
+              start_sequence ();
+              emit_move_insn (cheap, copy_rtx (dest));
+              restore = get_insns ();
+              end_sequence ();
+              lra_process_new_insns (curr_insn, NULL, restore,
+                                     "Inserting call parameter restore");
+              /* We don't need to save/restore of the pseudo from
+                 this call.  */
+              usage_insns[regno].calls_num = calls_num;
+              bitmap_set_bit (check_only_regs, regno);
+            }
+        }
       to_inherit_num = 0;
       /* Process insn usages.  */
       for (reg = curr_id->regs; reg != NULL; reg = reg->next)
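At the source level, the idea of restoring an argument from the call result looks roughly like the sketch below. It uses memcpy, which returns its first argument, as an example of a call whose result equals an argument; the wrapper functions are invented for illustration and say nothing about how LRA itself is structured.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* DST is live across the call, so without the optimization its value
   must survive the call in a callee-saved register or a stack slot.  */
static size_t copy_then_len(char *dst, const char *src, size_t n)
{
    assert(n > 0 && src[n - 1] == '\0'); /* SRC includes its terminator */
    memcpy(dst, src, n);
    return strlen(dst);
}

/* Same computation, but the pointer is taken back from the call
   result, so nothing has to be saved and restored around the call.  */
static size_t copy_then_len_restored(char *dst, const char *src, size_t n)
{
    assert(n > 0 && src[n - 1] == '\0');
    char *restored = memcpy(dst, src, n); /* memcpy returns DST */
    return strlen(restored);
}
```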
[PATCH][RFC, Reload]. Reload bug?
Hi,

Attached is a fix for what seems to be a reload bug while handling subreg(mem...). I ran into this problem while implementing support for struct load/store in AArch64 using the standard patterns vec_loadstore_laneslarge_int_modevec_mode along the same lines as the ARM backend. The test case that caused the issue was:

void
SexiALI_Convert(void *vdest, void *vsrc, unsigned int frames, int n)
{
  unsigned int x;
  short *src = vsrc;
  unsigned char *dest = vdest;
  for(x=0;x<256;x++)
    {
      int tmp;
      tmp = *src;
      src++;
      tmp += *src;
      src++;
      *dest++ = tmp;
    }
}

Before reload, this is the RTL dump I see:

.
(insn 110 114 111 4 (set (reg:V8HI 158 [ vect_var_.21 ])
        (subreg:V8HI (reg:OI 530 [ vect_array.20 ]) 0)) ice.i:9 512 {*aarch64_simd_movv8hi}
     (nil))
(insn 111 110 115 4 (set (reg:V8HI 159 [ vect_var_.22 ])
        (subreg:V8HI (reg:OI 530 [ vect_array.20 ]) 16)) ice.i:9 512 {*aarch64_simd_movv8hi}
     (expr_list:REG_DEAD (reg:OI 530 [ vect_array.20 ])
        (nil)))
(insn 115 111 116 4 (set (reg:V8HI 161 [ vect_var_.24 ])
        (subreg:V8HI (reg:OI 529 [ vect_array.23 ]) 0)) ice.i:9 512 {*aarch64_simd_movv8hi}
     (nil))
(insn 116 115 117 4 (set (reg:V8HI 162 [ vect_var_.25 ])
        (subreg:V8HI (reg:OI 529 [ vect_array.23 ]) 16)) ice.i:9 512 {*aarch64_simd_movv8hi}
     (expr_list:REG_DEAD (reg:OI 529 [ vect_array.23 ])
        (nil)))
(insn 117 116 118 4 (set (reg:V4SI 544 [ vect_var_.27 ])
        (sign_extend:V4SI (vec_select:V4HI (reg:V8HI 159 [ vect_var_.22 ])
                (parallel:V8HI [
                        (const_int 0 [0])
                        (const_int 1 [0x1])
                        (const_int 2 [0x2])
                        (const_int 3 [0x3])
                    ])))) ice.i:11 700 {aarch64_simd_vec_unpacks_lo_v8hi}
     (nil))
(insn 118 117 124 4 (set (reg:V4SI 545 [ vect_var_.26 ])
        (sign_extend:V4SI (vec_select:V4HI (reg:V8HI 158 [ vect_var_.21 ])
                (parallel:V8HI [
                        (const_int 0 [0])
                        (const_int 1 [0x1])
                        (const_int 2 [0x2])
                        (const_int 3 [0x3])
                    ])))) ice.i:9 700 {aarch64_simd_vec_unpacks_lo_v8hi}
     (nil))
.
In insn 116, reg_equiv_mem () of the pseudoreg 529 is (mem:OI (reg sp)), and the subreg is equivalent to:

    subreg:V8HI (mem:OI (reg sp) 16)

which does not get folded into

    mem:V8HI (plus:DI (reg sp) (const_int 16))

because, in reload.c:find_reloads_toplev (), where such subregs are narrowed into narrower memrefs, the memref supplied to strict_memory_address_addr_space_p () is just (mem:OI (reg sp)) and the SUBREG_BYTE is forgotten. Therefore strict_memory_address_addr_space_p () thinks that (mem:OI (reg sp)) is a valid target address, lets it pass as a subreg, and does not narrow the subreg into a narrower memref.

find_reloads_toplev () should in fact have given strict_memory_address_addr_space_p () the address (mem:OI (plus:DI (reg sp) (const_int 16))), for which it returns false because base+offset is invalid for NEON addressing modes, and this will then be reloaded into a narrower memref.

Also, I tried writing a secondary reload for this, but at no time is the RTL (subreg:V8HI (mem:OI (reg sp)) 16) available to the target secondary reload for it to fix it up.

Therefore, I've fixed find_reloads_toplev () to pass the full address to strict_memory_address_addr_space_p () in the case of subregs. Does this look like a sane fix? I've tested this patch on arm-none-eabi and bootstrapped on x86_64-pc-linux and all is well.

Thanks,
Tejas Belagod.
ARM.

Changelog:

2012-06-28 Tejas Belagod tejas.bela...@arm.com

gcc/
    * reload.c (find_reloads_toplev): Include the subreg byte in the address of memrefs when converting subregs of mems into narrower memrefs.

diff --git a/gcc/reload.c b/gcc/reload.c
index e42cc5c..b6d4ce9 100644
--- a/gcc/reload.c
+++ b/gcc/reload.c
@@ -4771,15 +4771,27 @@ find_reloads_toplev (rtx x, int opnum, enum reload_type type,
 #ifdef LOAD_EXTEND_OP
            && !paradoxical_subreg_p (x)
 #endif
-           && (reg_equiv_address (regno) != 0
-               || (reg_equiv_mem (regno) != 0
-                   && (! strict_memory_address_addr_space_p
-                       (GET_MODE (x), XEXP (reg_equiv_mem (regno), 0),
-                        MEM_ADDR_SPACE (reg_equiv_mem (regno)))
-                       || ! offsettable_memref_p (reg_equiv_mem (regno))
-                       || num_not_at_initial_offset))))
-    x = find_reloads_subreg_address (x, 1, opnum, type, ind_levels,
-                                     insn, address_reloaded);
+           )
+    {
+      if (reg_equiv_address (regno) != 0)
+        x = find_reloads_subreg_address (x, 1, opnum, type, ind_levels,
+                                         insn, address_reloaded);
+      else if (reg_equiv_mem (regno) != 0)
+        {
+          tem =
+
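The core of the report, that a subreg of a memory slot at byte 16 is really an access at base+16 and must be validated as such, can be modelled in a few lines. Everything below is a hypothetical sketch: the types stand in for OImode and V8HImode values, and vector_address_ok encodes the message's assumption that these loads accept a bare base register only. None of it is reload.c code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint8_t bytes[32]; } oi_value;   /* stands in for an OImode slot */
typedef struct { uint8_t bytes[16]; } v8hi_value; /* stands in for a V8HImode value */

/* Reading (subreg:V8HI (mem:OI slot) byte) touches the same bytes as
   (mem:V8HI (plus slot byte)).  */
static v8hi_value load_subreg(const oi_value *slot, size_t subreg_byte)
{
    v8hi_value v;
    memcpy(v.bytes, slot->bytes + subreg_byte, sizeof v.bytes);
    return v;
}

/* Assumed target rule: only a bare base register is a valid address.  */
static bool vector_address_ok(size_t offset)
{
    return offset == 0;
}

/* The check as described for the unpatched code: SUBREG_BYTE is dropped,
   so the bare base address is validated and always accepted.  */
static bool check_ignoring_subreg_byte(size_t subreg_byte)
{
    (void)subreg_byte;
    return vector_address_ok(0);
}

/* The check the message asks for: validate base + SUBREG_BYTE.  */
static bool check_with_subreg_byte(size_t subreg_byte)
{
    return vector_address_ok(subreg_byte);
}
```

For subreg byte 16 the first check accepts the address while the second rejects it, which is the case that should trigger narrowing into a separate memref.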
Re: [RFA] Enable dump-noaddr test to work in out of build tree testing
On Thu, Jun 28, 2012 at 6:50 AM, Matthew Gretton-Dann matthew.gretton-d...@arm.com wrote: On 28/06/12 14:38, Mike Stump wrote: On Jun 28, 2012, at 1:28 AM, Matthew Gretton-Dann matthew.gretton-d...@arm.com wrote: On 27/06/12 21:35, Andrew Pinski wrote: On Wed, Jun 27, 2012 at 3:33 AM, Matthew Gretton-Dann matthew.gretton-d...@arm.com wrote: All, This patch enables the dump-noaddr test to work in out-of-build-tree testing. [snip] I created a much simpler patch which I have been meaning to submit. I attached it for reference. Thanks, Andrew Pinski ChangeLog: * testsuite/gcc.c-torture/unsorted/dump-noaddr.x (dump_compare): Use an absolute dump base instead of a relative one. Index: gcc.c-torture/unsorted/dump-noaddr.x === --- gcc.c-torture/unsorted/dump-noaddr.x (revision 61452) +++ gcc.c-torture/unsorted/dump-noaddr.x (revision 61453) @@ -11,10 +11,10 @@ proc dump_compare { src options } { foreach option $option_list { file delete -force dump1 file mkdir dump1 - c-torture-compile $src $option $options -dumpbase dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr + c-torture-compile $src $option $options -dumpbase [pwd]/dump1/$dumpbase -DMASK=1 -x c --param ggc-min-heapsize=1 -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr file delete -force dump2 file mkdir dump2 - c-torture-compile $src $option $options -dumpbase dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr + c-torture-compile $src $option $options -dumpbase [pwd]/dump2/$dumpbase -DMASK=2 -x c -fdump-ipa-all -fdump-rtl-all -fdump-tree-all -fdump-noaddr foreach dump1 [lsort [glob -nocomplain dump1/*]] { regsub dump1/ $dump1 dump2/ dump2 set dumptail gcc.c-torture/unsorted/[file tail $dump1] What I don't like about this approach is that dump1 and dump2 are created in the current working directory. On vxworks as I recall we did a cd to tmpdir, is that generally true? 
Also, if one telnets in or sshes into the host under test, the cd is mandatory... as otherwise one would dump turds (that's a technical term) in the home directory which would be very uncool. Maybe a better approach would be to cd to the right place if all the Canadian setups cd, as that then unifies them. With out of build-tree testing this may not (I believe) be the same as $tmpdir (where temporaries are normally created). Also the current directory may already contain directories/files called dump1 or dump2 which will get destroyed by running the The point of the cd was to get to a place where temps can be created freely... I've not committed my version yet in case I am missing something in my reasoning above with regards to the relationship between the current working directory and $tmpdir. So the question would be, does his patch work for you? It was unclear to me if the answer is no. Sorry - the patch works for my use case (build==host), but I was concerned over the use of [pwd] vs $tmpdir. Both will work in the case of build==host. I don't even know if we really support build!=host testing at all. I have never seen it done and I have no idea how to control it via dejagnu. Has anyone tested build!=host recently? Thanks, Andrew Pinski Oh, wait, I know what I don't like about Andrew's patch, pwd, is that the directory on the target, the host or the build machine? And is that going to the host machine? They are not the same. One needs a directory on the host machine. I don't think this applies to my patch though, so are you still okay for my version to go in or is there something else I haven't considered? Thanks, Matt -- Matthew Gretton-Dann Principal Engineer, PD Software - Tools, ARM Ltd -- Matthew Gretton-Dann Principal Engineer, PD Software - Tools, ARM Ltd
Re: [RFA] Enable dump-noaddr test to work in out of build tree testing
On Jun 28, 2012, at 11:42 AM, Andrew Pinski wrote: Both will work in the case of build==host. I don't even know if we really support build!=host testing at all. Sure... works just fine, last I knew. Generally easy enough to fixup, if people get it wrong. I have never seen it done and I have no idea how to control it via dejagnu. Has anyone tested build!=host recently? Be curious to know if people do this anymore. Host testing a lame OS, like MS-DOS... was why it was put in.
[PATCH] Fix PR46556 (straight-line strength reduction, part 2)
Here's a relatively small piece of strength reduction that solves that pesky addressing bug that got me looking at this in the first place...

The main part of the code is the stuff that was reviewed last year, but which needed to find a good home. So hopefully that's in pretty good shape. I recast base_cand_map as an htab again since I now need to look up trees other than SSA names. I plan to put together a follow-up patch to change code and commentary references so that base_name becomes base_expr. Doing that now would clutter up the patch too much.

Bootstrapped and tested on powerpc64-linux-gnu with no new regressions. Ok for trunk?

Thanks,
Bill

gcc:

    PR tree-optimization/46556
    * gimple-ssa-strength-reduction.c (enum cand_kind): Add CAND_REF.
    (base_cand_map): Change to hash table.
    (base_cand_hash): New function.
    (base_cand_free): Likewise.
    (base_cand_eq): Likewise.
    (lookup_cand): Change base_cand_map to hash table.
    (find_basis_for_candidate): Likewise.
    (base_cand_from_table): Exclude CAND_REF.
    (restructure_reference): New function.
    (slsr_process_ref): Likewise.
    (find_candidates_in_block): Call slsr_process_ref.
    (dump_candidate): Handle CAND_REF.
    (base_cand_dump_callback): New function.
    (dump_cand_chains): Change base_cand_map to hash table.
    (replace_ref): New function.
    (replace_refs): Likewise.
    (analyze_candidates_and_replace): Call replace_refs.
    (execute_strength_reduction): Change base_cand_map to hash table.

gcc/testsuite:

    PR tree-optimization/46556
    * testsuite/gcc.dg/tree-ssa/slsr-27.c: New.
    * testsuite/gcc.dg/tree-ssa/slsr-28.c: New.
    * testsuite/gcc.dg/tree-ssa/slsr-29.c: New.
Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-27.c === --- gcc/testsuite/gcc.dg/tree-ssa/slsr-27.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-27.c (revision 0) @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-tree-dom2 } */ + +struct x +{ + int a[16]; + int b[16]; + int c[16]; +}; + +extern void foo (int, int, int); + +void +f (struct x *p, unsigned int n) +{ + foo (p-a[n], p-c[n], p-b[n]); +} + +/* { dg-final { scan-tree-dump-times \\* 4; 1 dom2 } } */ +/* { dg-final { scan-tree-dump-times p_\\d\+\\(D\\) \\+ D 1 dom2 } } */ +/* { dg-final { scan-tree-dump-times MEM\\\[\\(struct x \\*\\)D 3 dom2 } } */ +/* { dg-final { cleanup-tree-dump dom2 } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-28.c === --- gcc/testsuite/gcc.dg/tree-ssa/slsr-28.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-28.c (revision 0) @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-tree-dom2 } */ + +struct x +{ + int a[16]; + int b[16]; + int c[16]; +}; + +extern void foo (int, int, int); + +void +f (struct x *p, unsigned int n) +{ + foo (p-a[n], p-c[n], p-b[n]); + if (n 12) +foo (p-a[n], p-c[n], p-b[n]); + else if (n 3) +foo (p-b[n], p-a[n], p-c[n]); +} + +/* { dg-final { scan-tree-dump-times \\* 4; 1 dom2 } } */ +/* { dg-final { scan-tree-dump-times p_\\d\+\\(D\\) \\+ D 1 dom2 } } */ +/* { dg-final { scan-tree-dump-times MEM\\\[\\(struct x \\*\\)D 9 dom2 } } */ +/* { dg-final { cleanup-tree-dump dom2 } } */ Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-29.c === --- gcc/testsuite/gcc.dg/tree-ssa/slsr-29.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-29.c (revision 0) @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-options -O2 -fdump-tree-dom2 } */ + +struct x +{ + int a[16]; + int b[16]; + int c[16]; +}; + +extern void foo (int, int, int); + +void +f (struct x *p, unsigned int n) +{ + foo (p-a[n], p-c[n], p-b[n]); + if (n 3) +{ + foo (p-a[n], p-c[n], p-b[n]); + if (n 12) + foo (p-b[n], p-a[n], p-c[n]); +} +} + +/* { dg-final { 
scan-tree-dump-times \\* 4; 1 dom2 } } */ +/* { dg-final { scan-tree-dump-times p_\\d\+\\(D\\) \\+ D 1 dom2 } } */ +/* { dg-final { scan-tree-dump-times MEM\\\[\\(struct x \\*\\)D 9 dom2 } } */ +/* { dg-final { cleanup-tree-dump dom2 } } */ Index: gcc/gimple-ssa-strength-reduction.c === --- gcc/gimple-ssa-strength-reduction.c (revision 189025) +++ gcc/gimple-ssa-strength-reduction.c (working copy) @@ -32,7 +32,7 @@ along with GCC; see the file COPYING3. If not see 2) Explicit multiplies, unknown constant multipliers, no conditional increments. (data gathering complete, replacements pending) - 3) Implicit multiplies in addressing expressions. (pending) + 3) Implicit multiplies in addressing expressions. (complete) 4) Explicit multiplies, conditional increments. (pending) It would also be possible to apply strength
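The effect the new tests scan for (one multiply by 4, one pointer addition, three memory references off the shared address) can be written out by hand. The sketch below is a source-level illustration under the assumption that int is 4 bytes; the function names are invented and the code is not the output of the pass.

```c
#include <assert.h>
#include <stddef.h>

struct x {
    int a[16];
    int b[16];
    int c[16];
};

/* Straightforward form: each reference implies its own n * 4 scaling.  */
static int sum_plain(const struct x *p, unsigned int n)
{
    assert(n < 16);
    return p->a[n] + p->c[n] + p->b[n];
}

/* Strength-reduced form: scale n once, add it to p once, then reach
   the three arrays at constant offsets from that shared address.  */
static int sum_reduced(const struct x *p, unsigned int n)
{
    assert(n < 16);
    const char *base = (const char *)p + (size_t)n * sizeof(int);
    int a = *(const int *)(base + offsetof(struct x, a));
    int c = *(const int *)(base + offsetof(struct x, c));
    int b = *(const int *)(base + offsetof(struct x, b));
    return a + c + b;
}
```

Both functions read the same three elements; the second simply makes the shared base-plus-scaled-index computation explicit.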
Re: [wwwdocs] Update coding conventions for C++
On 6/27/12, Lawrence Crowl cr...@google.com wrote: ..., does anyone object to removing the permission to use C++ streams? Having heard no objection, I removed the permission. The following patch is the current state of the changes. Since the discussion appears to have died down, can I commit this patch? BTW, as before, I have removed the html tags from this patch, as they cause the mail server to reject the patch. Index: htdocs/codingconventions.html === RCS file: /cvs/gcc/wwwdocs/htdocs/codingconventions.html,v retrieving revision 1.66 diff -u -u -r1.66 codingconventions.html --- htdocs/codingconventions.html 19 Feb 2012 00:45:34 - 1.66 +++ htdocs/codingconventions.html 28 Jun 2012 22:03:38 - @@ -15,8 +19,73 @@ code to follow these conventions, it is best to send changes to follow the conventions separately from any other changes to the code./p +ul +lia href=#DocumentationDocumentation/a/li +lia href=#ChangeLogsChangeLogs/a/li +lia href=#PortabilityPortability/a/li +lia href=#MakefilesMakefiles/a/li +lia href=#TestsuiteTestsuite Conventions/a/li +lia href=#DiagnosticsDiagnostics Conventions/a/li +lia href=#SpellingSpelling, terminology and markup/a/li +lia href=#CandCxxC and C++ Language Conventions/a +ul +lia href=#C_OptionsCompiler Options/a/li +lia href=#C_LanguageLanguage Use/a +ul +lia href=#AssertionsAssertions/a/li +lia href=#CharacterCharacter Testing/a/li +lia href=#ErrorError Node Testing/a/li +lia href=#GeneratedParameters Affecting Generated Code/a/li +lia href=#C_InliningInlining Functions/a/li +/ul +/li +lia href=#C_FormattingFormatting Conventions/a +ul +lia href=#LineLine Length/a/li +lia href=#C_NamesNames/a/li +lia href=#ExpressionsExpressions/a/li +/ul +/li +/ul +/li +lia href=#Cxx_ConventionsC++ Language Conventions/a +ul +lia href=#Cxx_LanguageLanguage Use/a +ul +lia href=#VariableVariable Definitions/a/li +lia href=#Struct_UseStruct Definitions/a/li +lia href=#Class_UseClass Definitions/a/li +lia href=#ConstructorsConstructors and 
Destructors/a/li +lia href=#ConversionsConversions/a/li +lia href=#Over_FuncOverloading Functions/a/li +lia href=#Over_OperOverloading Operators/a/li +lia href=#DefaultDefault Arguments/a/li +lia href=#Cxx_InliningInlining Functions/a/li +lia href=#Template_UseTemplates/a/li +lia href=#Namespace_UseNamespaces/a/li +lia href=#RTTIRTTI and codedynamic_cast/code/a/li +lia href=#CastsOther Casts/a/li +lia href=#ExceptionsExceptions/a/li +lia href=#Standard_LibraryThe Standard Library/a/li +/ul +/li +lia href=#Cxx_FormattingFormatting Conventions/a +ul +lia href=#Cxx_NamesNames/a/li +lia href=#Struct_FormStruct Definitions/a/li +lia href=#Class_FormClass Definitions/a/li +lia href=#Member_FormClass Member Definitions/a/li +lia href=#Template_FormTemplates/a/li +lia href=#ExternCExtern C/a/li +lia href=#Namespace_FormNamespaces/a/li +/ul +/li +/ul +/li +/ul -h2Documentation/h2 + +h2a name=DocumentationDocumentation/a/h2 pDocumentation, both of user interfaces and of internals, must be maintained and kept up to date. In particular:/p @@ -43,7 +112,7 @@ /ul -h2ChangeLogs/h2 +h2a name=ChangeLogsChangeLogs/a/h2 pGCC requires ChangeLog entries for documentation changes; for the web pages (apart from codejava//code and codelibstdc++//code) the CVS @@ -71,20 +140,40 @@ codejava/58/code is the actual number of the PR) at the top of the ChangeLog entry./p -h2Portability/h2 +h2a name=PortabilityPortability/a/h2 pThere are strict requirements for portability of code in GCC to -older systems whose compilers do not implement all of the ISO C standard. -GCC requires at least an ANSI C89 or ISO C90 host compiler, and code -should avoid pre-standard style function definitions, unnecessary -function prototypes and use of the now deprecated @code{PARAMS} macro. +older systems whose compilers do not implement all of the +latest ISO C and C++ standards. +/p + +p +The directories +codegcc/code, codelibcpp/code and codefixincludes/code +may use C++03. 
+They may also use the codelong long/code type +if the host C++ compiler supports it. +These directories should use reasonably portable parts of C++03, +so that it is possible to build GCC with C++ compilers other than GCC itself. +If testing reveals that +reasonably recent versions of non-GCC C++ compilers cannot compile GCC, +then GCC code should be adjusted accordingly. +(Avoiding unusual language constructs helps immensely.) +Furthermore, +these directories emshould/em also be compatible with C++11. +/p + +p +The directories libiberty and libdecnumber must use C +and
Re: [testsuite] don't use lto plugin if it doesn't work
On Jun 28, 2012, Mike Stump mikest...@comcast.net wrote:
> On Jun 28, 2012, at 4:39 AM, Alexandre Oliva aol...@redhat.com wrote:
>> That still doesn't sound right to me: why should the compiler refrain from using a perfectly functional linker plugin on the machine where it's installed (not where it's built)?
>
> See your point below for one reason.

My point below suggests a reason for us to *verbosely* indicate the change, e.g., in the test command line, like my patch does.

> The next would be because it would be a speed hit to re-check at runtime the qualities of the linker and do something different.

But then, our testsuite *does* re-check at runtime, but without my patch, we're not using completely the result of the test.

> If the system had an architecture to avoid the speed hit and people wanted to do the work to support the runtime reconfigure, that'd be fine with me.

Me too, but I'm not arguing for or against that. I'm just arguing for a change to the test harness that will use the result of the dynamic test, and verbosely so.

>> Also, this scenario of silently deciding whether or not to use the linker plugin could bring us to different test results for the same command lines. I don't like that.
>
> Right, which is why the static configuration of the host system at build time is forever after an invariant.

That doesn't even match *current* reality. We can run the testsuite on a machine that's neither the build system nor the run-time target. That's presumably why the test harness tests whether the plugin works. And that's one reason why we should use that result instead of letting the compiler override it.

> The linker is smelled, it doesn't support plugins, therefore we can't ever use it, therefore we never build it...

'cept even in the build system it *does* support plugins, so it's just reasonable for us to build the plugin, and for the compiler to expect to be able to use it.

Now, this will work just fine if the compiler is installed on a system that matches the host=target (i.e., native compiler) triplet specified when building the compiler.

It might not work on the build machine, but that's irrelevant, for we're not supposed to be able to use the compiler on the build machine.

It might not work on the test machine, and that's why the test harness tests for plugin support. But the test harness doesn't communicate back to the compiler its findings without my patch, so if the test system doesn't happen to support plugins, we'd get tons of pointless failures.

If we change the compiler configuration so that it disables the plugin just because it guesses some potential incompatibility between the linker and the plugin we're about to build, we'll lose features and testing.

If we change the compiler to detect it dynamically, we'll get ambiguous test results. “did this -flto test use the plugin or not?”

Why would you want any of the scenarios in the two paragraphs above? If you wouldn't, what do you have against the patch that complements the plugin detection on the test machine in the test harness?

--
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist      Red Hat Brazil Compiler Engineer
Re: [Patch, libgfortran] Add FPU Support for powerpc
On Tue, May 22, 2012 at 3:45 AM, rbmj r...@verizon.net wrote:
> Hi everyone,
>
> This patch adds FPU support for powerpc on platforms that do not have glibc. It is basically the same code as glibc has. The motivation for this was that right now there is no fpu-target.h that works for powerpc-*-vxworks.
>
> Again, 90% of this code comes directly from glibc. But on vxworks targets there is no glibc.
>
> I also patched the configure.host script in order to add this in.
>
> Any opinions?

Since AFAICT nobody has responded...

I suppose this is something you need, or you would probably not be working on it. I wouldn't have thought of VxWorks as an obvious target platform for a Fortran compiler. :-)

The copying of the code from glibc (LGPL code) to libgfortran (GPL+exception) is something that you probably need permission for from the FSF.

For the VxWorks specific bits, you could poke the only listed VxWorks maintainer in MAINTAINERS (hi Nathan!).

For the configure.host bits,

> + powerpc)

Not powerpc64? Or at least powerpc|ppc? IIUC this test is overridden for powerpc-linux by a glibc test following your new code, right? What happens for e.g. powerpc-aix? Shouldn't your test also be conditional on have_feenableexcept?

Ciao!
Steven
Re: [PATCH] gfortran testsuite: implicitly cleanup-modules
Rehi Janis, Good to see you active again :) Perhaps you want to pursue this? We'd need to suggest this to dejagnu, have it in a release and bump the minimum required deja version of gcc. So it may take time but IMO would be a worthwhile cleanup. Or do you see a better way to handle this properly? The first patch below is the dejagnu part, the other patch is the corresponding follow-up for gcc. cheers, Bernhard On Fri, Mar 16, 2012 at 03:59:58PM +0100, Bernhard Reutner-Fischer wrote: On Fri, Mar 16, 2012 at 11:04:45AM +0100, Bernhard Reutner-Fischer wrote: The underlying problem is that dejagnu's runtest.exp only allows for a single libdir where it searches for includes -- see comment in libgomp.exp and libitm.exp While just adding more and more load_gcc_lib calls to users outside of gcc/ is the easy way out, it is (IMHO) error prone (i ran make check just in gcc and not in toplevel, fixed my script now). It would be desirable if dejagnu would just find all the currently load_gcc_lib'ed files on its own, via load_lib. One could - teach dejagnu to treat libdir as a list of paths The attached works for me for a toplevel make -k check (double-checked with individual make check in lib{gomp,itm}). I do not intend to pursue this any further. runtest.exp: add libdirs list for load_lib() libgomp wants to load .exp files from ../gcc/testsuite/lib. Instrument load_lib to be able to find the files. Previously we used to have a helper proc that had to first load all dependent .exp manually and then, again manually, the desired .exp. 2012-03-16 Bernhard Reutner-Fischer al...@gcc.gnu.org * runtest.exp (libdirs): New global list. (load_lib): Append libdirs to search_and_load_files directories. diff --git a/runtest.exp b/runtest.exp index 4bfed83..8e6a7de 100644 --- a/runtest.exp +++ b/runtest.exp @@ -589,7 +589,7 @@ proc lookfor_file { dir name } { # source tree, (up one or two levels), then in the current dir. 
# proc load_lib { file } { -global verbose libdir srcdir base_dir execpath tool +global verbose libdir libdirs srcdir base_dir execpath tool global loaded_libs if {[info exists loaded_libs($file)]} { @@ -597,8 +597,11 @@ proc load_lib { file } { } set loaded_libs($file) - -if { [search_and_load_file library file $file [list ../lib $libdir $libdir/lib [file dirname [file dirname $srcdir]]/dejagnu/lib $srcdir/lib $execpath/lib . [file dirname [file dirname [file dirname $srcdir]]]/dejagnu/lib]] == 0 } { +set search_dirs [list ../lib $libdir $libdir/lib [file dirname [file dirname $srcdir]]/dejagnu/lib $srcdir/lib $execpath/lib . [file dirname [file dirname [file dirname $srcdir]]]/dejagnu/lib] +if {[info exists libdirs]} { +lappend search_dirs $libdirs +} +if { [search_and_load_file library file $file $search_dirs ] == 0 } { send_error ERROR: Couldn't find library file $file.\n exit 1 } @@ -652,6 +655,8 @@ set libdir [file dirname $execpath]/dejagnu if {[info exists env(DEJAGNULIBS)]} { set libdir $env(DEJAGNULIBS) } +# list of extra directories for load_lib +set libdirs {} verbose Using $libdir to find libraries libgomp/ChangeLog 2012-03-16 Bernhard Reutner-Fischer al...@gcc.gnu.org * testsuite/lib/libgomp.exp: Set libdirs. Remove now redundant manual inclusion of gfortran-dg's dependencies. libitm/ChangeLog 2012-03-16 Bernhard Reutner-Fischer al...@gcc.gnu.org * testsuite/lib/libitm.exp: Set libdirs. Remove now redundant manual inclusion of gcc-dg's dependencies. diff --git a/libgomp/testsuite/lib/libgomp.exp b/libgomp/testsuite/lib/libgomp.exp index 02909f8..54e1e652 100644 --- a/libgomp/testsuite/lib/libgomp.exp +++ b/libgomp/testsuite/lib/libgomp.exp @@ -1,32 +1,12 @@ -# Damn dejagnu for not having proper library search paths for load_lib. -# We have to explicitly load everything that gcc-dg.exp wants to load. 
+global libdirs +lappend libdirs $srcdir/../../gcc/testsuite/lib -proc load_gcc_lib { filename } { -global srcdir loaded_libs +load_lib dg.exp -load_file $srcdir/../../gcc/testsuite/lib/$filename -set loaded_libs($filename) -} +# BUG: gcc-dg calls gcc-set-multilib-library-path but does not load gcc-defs! +load_lib gcc-defs.exp -load_lib dg.exp -load_gcc_lib file-format.exp -load_gcc_lib target-supports.exp -load_gcc_lib target-supports-dg.exp -load_gcc_lib scanasm.exp -load_gcc_lib scandump.exp -load_gcc_lib scanrtl.exp -load_gcc_lib scantree.exp -load_gcc_lib scanipa.exp -load_gcc_lib prune.exp -load_gcc_lib target-libpath.exp -load_gcc_lib wrapper.exp -load_gcc_lib gcc-defs.exp -load_gcc_lib torture-options.exp -load_gcc_lib timeout.exp -load_gcc_lib timeout-dg.exp -load_gcc_lib fortran-modules.exp -load_gcc_lib gcc-dg.exp -load_gcc_lib gfortran-dg.exp +load_lib gfortran-dg.exp set dg-do-what-default run diff --git a/libitm/testsuite/lib/libitm.exp b/libitm/testsuite/lib/libitm.exp index f322ed5..1ac8f31
Fwd: [Bug debug/53754] [4.8 Regression][lto] ICE in lhd_decl_printable_name, at langhooks.c:222 (with -g)
[resending in plain text. Sorry, gmail defaulted to HTML.] Ping. I'm not looking for commit approval yet, just advice on how thorough we need to be to support -g and LTO together. (What's the right way to send a patch to fix a PR? I'm not even sure whether you were cc'ed on my response.) -cary

-- Forwarded message --
From: ccoutant at gcc dot gnu.org gcc-bugzi...@gcc.gnu.org
Date: Mon, Jun 25, 2012 at 2:19 PM
Subject: [Bug debug/53754] [4.8 Regression][lto] ICE in lhd_decl_printable_name, at langhooks.c:222 (with -g)
To: ccout...@google.com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53754

Cary Coutant ccoutant at gcc dot gnu.org changed:

What       | Removed                       | Added
Status     | NEW                           | ASSIGNED
AssignedTo | unassigned at gcc dot gnu.org | ccoutant at gcc dot gnu.org

--- Comment #4 from Cary Coutant ccoutant at gcc dot gnu.org 2012-06-25 21:19:17 UTC --- Created attachment 27705 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=27705 Patch to fix ICE with -g -flto and anonymous namespace You can't delay producing pubnames this way with LTO. Please fix. The obvious problem is that we're calling langhooks.dwarf_name (in gen_namespace_die) for an anonymous namespace, even with the default -gno-pubnames. I can fix that by adding a check for want_pubnames just before the call to add_pubname_string, as in the patch below. But this is still going to ICE if you turn on -gpubnames with -lto. The only way I can think of to fix that is to relax the assert in lhd_decl_printable_name, and just have it return an empty string in the DECL_NAMELESS case. That will not produce the right results for an anonymous namespace, but without front-end langhooks available to us (and until we implement the lazy debug plan), how can we do better? How much is expected to work today with LTO and -g? Aren't we still stuck with calling langhooks from dwarf2out.c back-end routines? I can understand that we don't want to ICE, but what guarantees do we make about debug info?
-cary
Re: [PATCH] gfortran testsuite: implicitly cleanup-modules
On Jun 28, 2012, at 3:27 PM, Bernhard Reutner-Fischer wrote: Perhaps you want to pursue this? We'd need to suggest this to dejagnu, Actually, we have the technology, so that isn't necessary. :-) You can install replacements for any procs you want, not pretty, but... it does work. I think this is a more deterministic path forward than waiting for a mythical dejagnu release. Also, we then can avoid the hassle of requiring a new dejagnu.
Re: [PATCH] gfortran testsuite: implicitly cleanup-modules
On Thu, Jun 28, 2012 at 04:43:05PM -0700, Mike Stump wrote: On Jun 28, 2012, at 3:27 PM, Bernhard Reutner-Fischer wrote: Perhaps you want to pursue this? We'd need to suggest this to dejagnu, Actually, we have the technology, so that isn't necessary. :-) You can install replacements for any procs you want, not pretty, but... it does work. I think this is a more deterministic path forward than waiting for a mythical dejagnu release. Also, we then can avoid the hassle of requiring a new dejagnu. Wouldn't that mean that we have to completely replace proc load_lib? But anyway. Mike, it would be nice if you could fix +# BUG: gcc-dg calls gcc-set-multilib-library-path but does not load gcc-defs! if you did not do that already -- TIA :) That's under the assumption that one should be able to use the major lib/*exp without including their pre-requisites first. cheers,
[testsuite] gcc.dg/Wstrict-aliasing-converted-assigned.c: fix dg-message errors
Test gcc.dg/Wstrict-aliasing-converted-assigned.c uses a combination of target and xfail selectors in a way that would be nice if it worked, but it doesn't. Unfortunately the local code to override dg-error and friends ignores errors, so directives with errors have been silently skipped. I plan to fix that after fixing the affected tests. This patch causes the affected dg-message directives in this test to be XFAIL'd everywhere, with a comment asking that when the test starts passing on the relevant targets, the xfail be replaced with a target list. It also adds comments to the dg-message directives to make their messages unique in the test summary. Tested on i686-pc-linux-gnu; OK for trunk? Janis

2012-06-28 Janis Johnson jani...@codesourcery.com

* gcc.dg/Wstrict-aliasing-converted-assigned.c: Fix syntax errors in dg-message directives, add comments.

Index: gcc.dg/Wstrict-aliasing-converted-assigned.c
===================================================================
--- gcc.dg/Wstrict-aliasing-converted-assigned.c (revision 189025)
+++ gcc.dg/Wstrict-aliasing-converted-assigned.c (working copy)
@@ -5,9 +5,12 @@
 int foo()
 {
   int i;
-  *(long*)&i = 0;  /* { dg-warning "type-punn" } */
+  *(long*)&i = 0;  /* { dg-warning "type-punn" "type-punn" } */
   return i;
 }

-/* { dg-message "does break strict-aliasing" "" { target { *-*-* && lp64 } xfail *-*-* } 8 } */
-/* { dg-message "initialized" "" { target { *-*-* && lp64 } xfail *-*-* } 8 } */
+/* These messages are only expected for lp64, but fail there.  When they
+   pass for lp64, replace "xfail *-*-*" with "target lp64".  */
+
+/* { dg-message "does break strict-aliasing" "break" { xfail *-*-* } 8 } */
+/* { dg-message "initialized" "init" { xfail *-*-* } 8 } */
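For readers who have not looked at the test itself: the store through a long lvalue into an int object is the type pun that -Wstrict-aliasing is expected to diagnose. Below is a minimal sketch that shows the flagged store as a comment next to a well-defined alternative. The function name is made up for illustration and is not part of the testsuite.

```c
#include <string.h>

/* Hypothetical helper mirroring the test's foo().  The commented-out
   line is the kind of store the warning is about; memcpy overwrites
   the bytes of i without accessing it through an incompatible type.  */
int clear_int_without_punning(void)
{
    int i = 42;
    const int zero = 0;

    /* *(long *)&i = 0;   <- accesses an int object through a long lvalue */
    memcpy(&i, &zero, sizeof i);
    return i;
}
```

Calling clear_int_without_punning() yields 0 without relying on any aliasing assumption.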
[testsuite] add required comments to dg-message directives in g++.dg
Several tests in g++.dg use dg-message with a target list and line number but without the comment field, which is required when those additional arguments are used. The local replacement of dg-message silently ignores errors (something I plan to fix), so the checks have been ignored. Unprocessed notes (as opposed to errors and warnings) in compiler output are intentionally ignored, so this wasn't noticed before. This patch adds the required comments, and the tests now pass on i686-pc-linux-gnu. OK for trunk? Janis

2012-06-28 Janis Johnson jani...@codesourcery.com

* g++.dg/template/error46.C: Add missing comment to dg-message.
* g++.dg/template/crash107.C: Likewise.
* g++.dg/template/error47.C: Likewise.
* g++.dg/template/crash108.C: Likewise.
* g++.dg/overload/operator5.C: Likewise.

Index: g++.dg/template/error46.C
===================================================================
--- g++.dg/template/error46.C (revision 189025)
+++ g++.dg/template/error46.C (working copy)
@@ -8,4 +8,4 @@
 {
   foo(A<0>(), A<1>()); // { dg-error "no matching" }
 }
-// { dg-message "candidate|parameter 'N' ('0' and '1')" { target *-*-* } 9 }
+// { dg-message "candidate|parameter 'N' ('0' and '1')" "" { target *-*-* } 9 }
Index: g++.dg/template/crash107.C
===================================================================
--- g++.dg/template/crash107.C (revision 189025)
+++ g++.dg/template/crash107.C (working copy)
@@ -14,7 +14,7 @@
   }
 };
 Vec<double> v(3,4,12); // { dg-error "no matching" }
-// { dg-message "note" { target *-*-* } 16 }
+// { dg-message "note" "note" { target *-*-* } 16 }
 Vec<double> V(12,4,3); // { dg-error "no matching" }
-// { dg-message "note" { target *-*-* } 18 }
+// { dg-message "note" "note" { target *-*-* } 18 }
 Vec<double> c = v^V; // { dg-message "required" }
Index: g++.dg/template/error47.C
===================================================================
--- g++.dg/template/error47.C (revision 189025)
+++ g++.dg/template/error47.C (working copy)
@@ -6,4 +6,4 @@
 {
   foo(0, p); // { dg-error "no matching" }
 }
-// { dg-message "candidate|parameter 'T' ('int' and 'void*')" { target *-*-* } 7 }
+// { dg-message "candidate|parameter 'T' ('int' and 'void*')" "" { target *-*-* } 7 }
Index: g++.dg/template/crash108.C
===================================================================
--- g++.dg/template/crash108.C (revision 189025)
+++ g++.dg/template/crash108.C (working copy)
@@ -2,4 +2,4 @@
 template<class T> struct A {A(int b=k(0));}; // { dg-error "arguments" }
 void f(int k){A<int> a;} // // { dg-error "parameter|declared" }
-// { dg-message "note" { target *-*-* } 3 }
+// { dg-message "note" "note" { target *-*-* } 3 }
Index: g++.dg/overload/operator5.C
===================================================================
--- g++.dg/overload/operator5.C (revision 189025)
+++ g++.dg/overload/operator5.C (working copy)
@@ -13,4 +13,4 @@
 const String b, bool ignoreCase) {
 return ignoreCase ? equalIgnoringCase(a, b) : (a == b); } // { dg-error "ambiguous" }
-// { dg-message "note" { target *-*-* } 15 }
+// { dg-message "note" "note" { target *-*-* } 15 }
[testsuite] g++.dg/cpp0x/nullptr19.c: remove duplicate dg-message
Test g++.dg/cpp0x/nullptr19.c contains the following:

char* k( char* ); /* { dg-message "note" } { dg-message "note" } */
nullptr_t k( nullptr_t ); /* { dg-message "note" } { dg-message "note" } */

Having two test directives on a line should have resulted in an ERROR but the local replacement of dg-warning silently ignores errors (something I plan to fix). There are two notes for each of these lines, identical but after different candidate lists. Since they are identical DejaGnu removes both of them after one has been processed, and there is apparently no way to check for both of them. At least with this patch we'll correctly check for one for each line. Tested on i686-pc-linux-gnu; OK for trunk? Janis

2012-06-28 Janis Johnson jani...@codesourcery.com

* g++.dg/cpp0x/nullptr19.c: Remove extra directives on same line.

Index: g++.dg/cpp0x/nullptr19.C
===================================================================
--- g++.dg/cpp0x/nullptr19.C (revision 189025)
+++ g++.dg/cpp0x/nullptr19.C (working copy)
@@ -5,8 +5,8 @@

 typedef decltype(nullptr) nullptr_t;

-char* k( char* ); /* { dg-message "note" } { dg-message "note" } */
-nullptr_t k( nullptr_t ); /* { dg-message "note" } { dg-message "note" } */
+char* k( char* ); /* { dg-message "note" } */
+nullptr_t k( nullptr_t ); /* { dg-message "note" } */

 void test_k()
 {
Re: [testsuite] g++.dg/cpp0x/nullptr19.c: remove duplicate dg-message
On Jun 28, 2012, at 5:57 PM, Janis Johnson wrote: Test g++.dg/cpp0x/nullptr19.c contains the following: OK for trunk? Ok.
Re: [testsuite] add required comments to dg-message directives in g++.dg
On Jun 28, 2012, at 5:56 PM, Janis Johnson wrote: Several tests in g++.dg use dg-message with a target list and line number but without the comment field, which is required when those additional arguments are used. OK for trunk? Ok.
Re: [testsuite] gcc.dg/Wstrict-aliasing-converted-assigned.c: fix dg-message errors
On Jun 28, 2012, at 5:55 PM, Janis Johnson wrote: Test gcc.dg/Wstrict-aliasing-converted-assigned.c uses a combination of target and xfail selectors in a way that would be nice if it worked, OK for trunk? Ok. I prefer no spacing between the comment and the dg-message lines... ok either way.
Re: [PATCH] gfortran testsuite: implicitly cleanup-modules
On Jun 28, 2012, at 5:15 PM, Bernhard Reutner-Fischer wrote: On Thu, Jun 28, 2012 at 04:43:05PM -0700, Mike Stump wrote: On Jun 28, 2012, at 3:27 PM, Bernhard Reutner-Fischer wrote: Perhaps you want to pursue this? We'd need to suggest this to dejagnu, Actually, we have the technology, so that isn't necessary. :-) You can install replacements for any procs you want, not pretty, but... it does work. I think this is a more deterministic path forward than waiting for a mythical dejagnu release. Also, we then can avoid the hassle of requiring a new dejagnu. Wouldn't that mean that we have to completely replace proc load_lib? Yes; worse, it is a cut-n-paste from dejagnu and can effectively rev lock us to the current dejagnu release... One can delegate, but I don't think any pre or post processing in this case is enough to `fix' the issue, so it would be a wholesale replacement. But anyway. Mike, it would be nice if you could fix +# BUG: gcc-dg calls gcc-set-multilib-library-path but does not load gcc-defs! Sounds like a single line fix. It is the testing of that fix that is the annoying part.
Re: [testsuite] gcc.dg/vect/vect-50.c: combine two scans
On Jun 28, 2012, at 10:26 AM, Janis Johnson wrote: No, there is no way to combine target and xfail, Ah... Grrr I hate non-composability. Given that, I think the original patch is fine, subject of course to the wants and wishes of vect people.
Ping: Reorganized documentation for warnings -- attempt 2
http://gcc.gnu.org/ml/gcc-patches/2012-06/msg01208.html
Re: New option to turn off stack reuse for temporaries
(re-post in plain text) Moving this to cfgexpand time is simple and it can also be extended to handle scoped variables. However Jakub raised a good point about this being too late as stack space overlay is not the only way to cause trouble when the lifetime of a stack object is extended beyond the clobber stmt. thanks, David On Tue, Jun 26, 2012 at 1:28 AM, Richard Guenther richard.guent...@gmail.com wrote: On Mon, Jun 25, 2012 at 6:25 PM, Xinliang David Li davi...@google.com wrote: Are there any more concerns about this patch? If not, I'd like to check it in. No - the fact that the flag is C++ specific but in common.opt is odd enough and -ftemp-reuse-stack sounds very very generic - which in fact it is not, it's a no-op in C. Is there a more formal phrase for the temporary kind that is affected? For me temp is synonymous to auto so I'd have expected the switch to turn off stack slot sharing for { int a[5]; } { int a[5]; } but that is not what it does. So - a little kludgy but probably more to what I'd like it to be would be to move the option to c-family/c.opt enabled only for C++ and Obj-C++ and export it to the middle-end via a new langhook (the gimplifier code should be in Frontend code that lowers to GENERIC really and the WITH_CLEANUP_EXPR code should be C++ frontend specific ...). Thanks, Richard. thanks, David On Fri, Jun 22, 2012 at 8:51 AM, Xinliang David Li davi...@google.com wrote: On Fri, Jun 22, 2012 at 2:39 AM, Richard Guenther richard.guent...@gmail.com wrote: On Fri, Jun 22, 2012 at 11:29 AM, Jason Merrill ja...@redhat.com wrote: On 06/22/2012 01:30 AM, Richard Guenther wrote: What other issues? It enables more potential code motion, but on the other hand, causes more conservative stack reuse. As far as I can tell, the handling of temporaries was added independently, after the clobbers for scoped variables were introduced. This option can be used to restore the older behavior (in handling temps).
Well, it does not really restore the old behavior (if you mean before adding CLOBBERS, not before the single patch that might have used those for gimplifying WITH_CLEANUP_EXPR). You say it disables stack-slot sharing for those decls but it also does other things via side-effects of no longer emitting the CLOBBER. I say it's better to disable the stack-slot sharing. The patch exactly restores the behavior of temporaries from before my change to add CLOBBERs for temporaries. The primary effect of that change was to provide stack-slot sharing, but if there are other effects they are probably desirable as well, since the broken code depended on the old behavior. So you see it as a workaround option, like -fno-strict-aliasing, rather than a debugging aid? It can be used for both purposes -- if the violations are as pervasive as the strict-aliasing cases (which appears to be so). thanks, David Richard. Jason
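Richard's { int a[5]; } { int a[5]; } case can be spelled out in C as a rough illustration of generic stack-slot sharing, which, per the thread, is what he expected the option name to cover and is not what the option controls (that is C++ temporaries). In the sketch below the two arrays have disjoint lifetimes, so a compiler may give them the same stack slot; the code stays valid because every value is copied out before its array goes out of scope. The function name is invented for the example.

```c
/* Illustration only: two block-scoped arrays whose lifetimes do not
   overlap.  A compiler is free to place both at the same stack
   address, so no pointer to either may be used after its block ends.  */
int sum_in_two_scopes(void)
{
    int first_total = 0;
    int second_total = 0;

    {
        int a[5] = {1, 2, 3, 4, 5};
        for (int i = 0; i < 5; i++)
            first_total += a[i];    /* copied out while a is alive */
    }
    {
        int a[5] = {10, 20, 30, 40, 50};
        for (int i = 0; i < 5; i++)
            second_total += a[i];   /* may reuse the first array's slot */
    }
    return first_total + second_total;
}
```

Code that instead kept a pointer into the first array and read it inside the second block would be the kind of lifetime violation the thread is concerned with.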
Re: [PATCH] Add MULT_HIGHPART_EXPR
On Fri, Jun 29, 2012 at 12:00:10AM +0200, Bernhard Reutner-Fischer wrote: Really both HI? If so optab2 could be removed from that fn altogether.. Of course, thanks for pointing that out. I've additionally added a result mode check (similar to what supportable_widening_operation does). The reason for not using supportable_widening_operation is that it only tests even/odd calls for reductions, while we can use them everywhere. Committed as obvious.

2012-06-29 Jakub Jelinek ja...@redhat.com

* tree-vect-stmts.c (vectorizable_operation): Check both VEC_WIDEN_MULT_LO_EXPR and VEC_WIDEN_MULT_HI_EXPR optabs. Verify that operand[0]'s mode is TYPE_MODE (wide_vectype).

--- gcc/tree-vect-stmts.c (revision 189053)
+++ gcc/tree-vect-stmts.c (working copy)
@@ -3504,14 +3504,19 @@ vectorizable_operation (gimple stmt, gim
         {
           decl1 = NULL_TREE;
           decl2 = NULL_TREE;
-          optab = optab_for_tree_code (VEC_WIDEN_MULT_HI_EXPR,
+          optab = optab_for_tree_code (VEC_WIDEN_MULT_LO_EXPR,
                                        vectype, optab_default);
           optab2 = optab_for_tree_code (VEC_WIDEN_MULT_HI_EXPR,
                                         vectype, optab_default);
           if (optab != NULL
               && optab2 != NULL
               && optab_handler (optab, vec_mode) != CODE_FOR_nothing
-              && optab_handler (optab2, vec_mode) != CODE_FOR_nothing)
+              && optab_handler (optab2, vec_mode) != CODE_FOR_nothing
+              && insn_data[optab_handler (optab, vec_mode)].operand[0].mode
+                 == TYPE_MODE (wide_vectype)
+              && insn_data[optab_handler (optab2,
+                                          vec_mode)].operand[0].mode
+                 == TYPE_MODE (wide_vectype))
             {
               for (i = 0; i < nunits_in; i++)
                 sel[i] = !BYTES_BIG_ENDIAN + 2 * i;

Jakub
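As a rough scalar model of what the widening-multiply-plus-permute fallback computes, the sketch below multiplies 16-bit elements into 32-bit products, views the products as 16-bit halves, and picks the high half of each with the same selector formula as the patch, sel[i] = !BYTES_BIG_ENDIAN + 2 * i. It is a simplification under stated assumptions: NUNITS, host_is_big_endian and mult_highpart_u16 are names invented here, the real expansion works on separate LO and HI product vectors, and host byte order stands in for BYTES_BIG_ENDIAN.

```c
#include <stdint.h>
#include <string.h>

#define NUNITS 8   /* assumed element count, e.g. a vector of 8 shorts */

/* Stand-in for BYTES_BIG_ENDIAN: probe the host byte order at run time.  */
static int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first_byte;

    memcpy(&first_byte, &probe, 1);
    return first_byte == 0;
}

/* out[i] receives the upper 16 bits of the 32-bit product a[i] * b[i].  */
void mult_highpart_u16(const uint16_t *a, const uint16_t *b, uint16_t *out)
{
    uint32_t wide[NUNITS];        /* the widened products */
    uint16_t halves[2 * NUNITS];  /* same bytes viewed as narrow elements */
    int sel[NUNITS];
    int i;

    for (i = 0; i < NUNITS; i++)
        wide[i] = (uint32_t) a[i] * (uint32_t) b[i];

    memcpy(halves, wide, sizeof wide);

    /* On little-endian the high half of product i is narrow element
       2*i + 1; on big-endian it is element 2*i.  */
    for (i = 0; i < NUNITS; i++)
        sel[i] = !host_is_big_endian() + 2 * i;

    for (i = 0; i < NUNITS; i++)
        out[i] = halves[sel[i]];
}
```

Each out[i] should equal ((uint32_t) a[i] * b[i]) >> 16, which is the per-element value a highpart multiply is meant to produce.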