[patch] [fixincludes] Ignore .DS_Store junk files when running make check
I've attached trivial, 1-line patch to fixincludes_check.tpl; it allows 'make check' to succeed on OS X, by ignoring the files that Finder creates to keep track of the status of directories. -Eric Gallager fixincludes/check.tpl | 1 + 1 file changed, 1 insertion(+) diff --git a/fixincludes/check.tpl b/fixincludes/check.tpl index 0d1f444..300aeac 100644 --- a/fixincludes/check.tpl +++ b/fixincludes/check.tpl @@ -143,6 +143,7 @@ cd $TESTBASE find * -type f -print | \ fgrep -v 'CVS/' | \ +fgrep -v '.DS_Store' | \ fgrep -v '.svn/' > ${TESTDIR}/LIST exitok=`
[patch, committed] minor cleanup in nios2 backend
I've checked in the attached patch to clean up a few stylistic issues in the nios2 back end, e.g. switching to the new "rtx_insn *" type instead of "rtx" for things that are explicitly insns. There's no change in functionality. -Sandra 2015-07-04 Sandra Loosemore gcc/ * config/nios2/nios2.c (save_reg, restore_reg): Use plus_constant. Use rtx_insn * instead of rtx. (nios2_emit_add_constant): Use rtx_insn * instead of rtx. (nios2_expand_prologue, nios2_expand_epilogue): Likewise. (nios2_call_tls_get_addr): Likewise. (nios2_emit_expensive_div): Likewise. (nios2_emit_move_sequence): Change return type to bool. * config/nios2/nios2-protos.h (nios2_emit_move_sequence): Change return type to bool. Index: gcc/config/nios2/nios2.c === --- gcc/config/nios2/nios2.c (revision 225323) +++ gcc/config/nios2/nios2.c (working copy) @@ -446,9 +446,8 @@ static void save_reg (int regno, unsigned offset) { rtx reg = gen_rtx_REG (SImode, regno); - rtx addr = gen_rtx_PLUS (Pmode, stack_pointer_rtx, - gen_int_mode (offset, Pmode)); - rtx insn = emit_move_insn (gen_frame_mem (Pmode, addr), reg); + rtx addr = plus_constant (Pmode, stack_pointer_rtx, offset, false); + rtx_insn *insn = emit_move_insn (gen_frame_mem (Pmode, addr), reg); RTX_FRAME_RELATED_P (insn) = 1; } @@ -456,9 +455,8 @@ static void restore_reg (int regno, unsigned offset) { rtx reg = gen_rtx_REG (SImode, regno); - rtx addr = gen_rtx_PLUS (Pmode, stack_pointer_rtx, - gen_int_mode (offset, Pmode)); - rtx insn = emit_move_insn (reg, gen_frame_mem (Pmode, addr)); + rtx addr = plus_constant (Pmode, stack_pointer_rtx, offset, false); + rtx_insn *insn = emit_move_insn (reg, gen_frame_mem (Pmode, addr)); /* Tag epilogue unwind note. */ add_reg_note (insn, REG_CFA_RESTORE, reg); RTX_FRAME_RELATED_P (insn) = 1; @@ -479,10 +477,10 @@ nios2_emit_stack_limit_check (void) /* Temp regno used inside prologue/epilogue. */ #define TEMP_REG_NUM 8 -static rtx +static rtx_insn * nios2_emit_add_constant (rtx reg, HOST_WIDE_INT immed) { - rtx insn; + rtx_insn *insn; if (SMALL_INT (immed)) insn = emit_insn (gen_add2_insn (reg, gen_int_mode (immed, Pmode))); else @@ -501,7 +499,7 @@ nios2_expand_prologue (void) int total_frame_size, save_offset; int sp_offset; /* offset from base_reg to final stack value. */ int save_regs_base; /* offset from base_reg to register save area. */ - rtx insn; + rtx_insn *insn; total_frame_size = nios2_compute_frame_layout (); @@ -587,7 +585,8 @@ nios2_expand_prologue (void) void nios2_expand_epilogue (bool sibcall_p) { - rtx insn, cfa_adj; + rtx_insn *insn; + rtx cfa_adj; int total_frame_size; int sp_adjust, save_offset; unsigned int regno; @@ -1180,7 +1179,8 @@ nios2_call_tls_get_addr (rtx ti) { rtx arg = gen_rtx_REG (Pmode, FIRST_ARG_REGNO); rtx ret = gen_rtx_REG (Pmode, FIRST_RETVAL_REGNO); - rtx fn, insn; + rtx fn; + rtx_insn *insn; if (!nios2_tls_symbol) nios2_tls_symbol = init_one_libfunc ("__tls_get_addr"); @@ -1343,10 +1343,10 @@ nios2_emit_expensive_div (rtx *operands, rtx or_result, shift_left_result; rtx lookup_value; rtx_code_label *lab1, *lab3; - rtx insns; + rtx_insn *insns; rtx libfunc; rtx final_result; - rtx tmp; + rtx_insn *tmp; rtx table; /* It may look a little generic, but only SImode is supported for now. */ @@ -1928,7 +1928,7 @@ nios2_delegitimize_address (rtx x) } /* Main expander function for RTL moves. */ -int +bool nios2_emit_move_sequence (rtx *operands, machine_mode mode) { rtx to = operands[0]; @@ -1947,7 +1947,7 @@ nios2_emit_move_sequence (rtx *operands, operands[0] = to; operands[1] = from; - return 0; + return false; } /* The function with address *ADDR is being called. If the address Index: gcc/config/nios2/nios2-protos.h === --- gcc/config/nios2/nios2-protos.h (revision 225323) +++ gcc/config/nios2/nios2-protos.h (working copy) @@ -29,7 +29,7 @@ extern void nios2_expand_epilogue (bool) extern void nios2_function_profiler (FILE *, int); #ifdef RTX_CODE -extern int nios2_emit_move_sequence (rtx *, machine_mode); +extern bool nios2_emit_move_sequence (rtx *, machine_mode); extern void nios2_emit_expensive_div (rtx *, machine_mode); extern void nios2_adjust_call_address (rtx *, rtx);
Re: [PATCH] c/66516 - missing diagnostic on taking the address of a builtin function
I don't think c_validate_addressable is a good name - given that it's called for lots of things that aren't addressable, in contexts in which there is no need for them to be addressable, and doesn't do checks of addressability in contexts where they are actually needed and done elsewhere (e.g. checks for not taking the address of a register variable). The question seems to be something more like: is the expression used as an operand something it's OK to use as an operand at all? Thank you for the review. I've changed the name to c_reject_gcc_builtin. If you would prefer a different name still please suggest one. I'm not partial to any one in particular. What is the logic for the list of functions above being a complete list of the places that need changes? The logic by which I arrived at the changes was by constructing test cases exercising expressions where a function is a valid operand and where its address might need to be obtained when it's not available, and stepping through the code and modifying it until I found a suitable place to change to reject it. How can ifexp be of pointer type? It's undergone truthvalue conversion and should always be of type int at this point (but in any case, you can't refer to TREE_OPERAND (ifexp, 0) without knowing what sort of expression it is). I've removed the redundant test from this function. +/* For EXPR that is an ADDR_EXPR or whose type is a FUNCTION_TYPE, + determines whether its operand can have its address taken issues + an error pointing to the location LOC. + Operands that cannot have their address taken are builtin functions + that have no library fallback (no other kinds of expressions are + considered). + Returns true when the expression can have its address taken and + false otherwise. */ Apart from the naming issue, the comment says nothing about the semantics of the function for an argument that's not of that form. I've reworded the comment to hopefully make the semantics of the function more clear. Attached is an updated patch with these changes. Martin gcc/ChangeLog 2015-07-04 Martin Sebor pr c/66516 * tree.h (DECL_IS_GCC_BUILTIN): New macro. * doc/extend.texi (Other Builtins): Document when the address of a built-in function can be taken. gcc/c/ChangeLog 2015-07-04 Martin Sebor pr c/66516 * c-tree.h (c_reject_gcc_builtin): New function. * c-typeck.c (convert_arguments, parser_build_unary_op): Call it. (build_conditional_expr, c_cast_expr, convert_for_assignment): Same. (build_binary_op, c_objc_common_truthvalue_conversion): Same. (c_reject_gcc_builtin): Define function. gcc/cp/ChangeLog 2015-07-04 Martin Sebor pr c/66516 * cp-tree.h (cp_reject_gcc_builtin): New function. * call.c (build_conditional_expr_1): Call it. (convert_arg_to_ellipsis, convert_for_arg_passing): Same. * pt.c (convert_template_argument): Same. * typeck.c (cp_build_binary_op, cp_build_addr_expr_strict): Same. (cp_build_unary_op, build_static_cast_1, build_reinterpret_cast_1): Same. (cp_build_c_cast, convert_for_assignment, convert_for_initialization): Same. (cp_reject_gcc_builtin): Define function. gcc/testsuite/ChangeLog 2015-07-04 Martin Sebor pr c/66516 * g++.dg/addr_builtin-1.C: New test. * gcc.dg/addr_builtin-1.c: New test. diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h index 28b58c6..6a492c8 100644 --- a/gcc/c/c-tree.h +++ b/gcc/c/c-tree.h @@ -655,6 +655,7 @@ extern tree c_finish_transaction (location_t, tree, int); extern bool c_tree_equal (tree, tree); extern tree c_build_function_call_vec (location_t, vec, tree, vec *, vec *); +extern bool c_reject_gcc_builtin (const_tree, location_t = UNKNOWN_LOCATION); /* Set to 0 at beginning of a function definition, set to 1 if a return statement that specifies a return value is seen. */ diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c index 636e0bb..d27ace2 100644 --- a/gcc/c/c-typeck.c +++ b/gcc/c/c-typeck.c @@ -3343,6 +3343,10 @@ convert_arguments (location_t loc, vec arg_loc, tree typelist, error (invalid_func_diag); return -1; } + else if (TREE_CODE (val) == ADDR_EXPR && c_reject_gcc_builtin (val)) + { + return -1; + } else /* Convert `short' and `char' to full-size `int'. */ parmval = default_conversion (val); @@ -3380,12 +3384,20 @@ parser_build_unary_op (location_t loc, enum tree_code code, struct c_expr arg) { struct c_expr result; - result.value = build_unary_op (loc, code, arg.value, 0); result.original_code = code; result.original_type = NULL; + if (c_reject_gcc_builtin (arg.value)) +{ + result.value = error_mark_node; +} + else +{ + result.value = build_unary_op (loc, code, arg.value, 0); + if (TREE_OVERFLOW_P (result.value) && !TREE_OVERFLOW_P (arg.value)) overflow_warning (loc, result.value); +} return result; } @@ -4486,6 +4498,12 @@ build_conditional_expr (location_t colon_loc, tree ifexp, bool ifexp_bcp, type2 = TREE_TYPE (op2
Re: [gomp] Move openacc vector& worker single handling to RTL
On 07/03/15 19:11, Jakub Jelinek wrote: On Fri, Jul 03, 2015 at 06:51:57PM -0400, Nathan Sidwell wrote: IMHO this is a step towards putting target-dependent handling in the target compiler and out of the more generic host-side compiler. The changelog is separated into 3 parts - a) general infrastructure - b) additiona - c) deletions. comments? Thanks for working on it. If the builtins are not meant to be used by users directly (I assume they aren't) nor have a 1-1 correspondence to a library routine, it is much better to emit them as internal calls (see internal-fn.{c,def}) instead of BUILT_IN_NORMAL functions. thanks, Cesar pointed me at the internal builtins too -- I'll take a look. nathan
Re: [Ping, Patch, Fortran, PR58586, v5] ICE with derived type with allocatable component passed by value
Hi Steve, Thanks for looking at the code. The error you experience is known to me. The bug is present in gfortran and only exposed by this patch. Unfortunately is the pr58586 not addressing this specific error. It may be in the bugtracker under a different number already. Furthermore did I not want to extend the patch for 58586 any further, because I have learned that the more complicated a patch gets the longer review takes. For making the testcase run fine we also simply can comment the line. Regards, Andre Am 4. Juli 2015 18:24:59 MESZ, schrieb Steve Kargl : >On Fri, Jul 03, 2015 at 11:29:00AM +0200, Andre Vehreschild wrote: >> Ping! >> > >(Un)fortnuately you're working on an area of Fortran >that I don't know and in parts of the tree that takes >me a long time to decipher (aka, trans-*.c files). > >I applied your patch and see several failures. I'll >note that I did not start from a clean obj/. So, there >is the possibility that some *.o file needed to get >rebuilt but didn't. Anyhow, > >laptop-kargl:kargl[300] gfc -o z alloc_comp_class_4.f03 && ./z > >Program received signal SIGSEGV: Segmentation fault - invalid memory >reference. > >Backtrace for this error: >#0 0x2808B9A6 >#1 0x2808AB19 >#2 0xBFBFF003 >#3 0x8048DEA >#4 0x8049097 >#5 0x8048779 >Segmentation fault (core dumped) > >Hmmm, Ok, I just looked at the source for alloc_comp_class_4.f03 >and found line 94. > > temp = t_init() ! <-- This derefs a null-pointer currently > >Not sure what to make of this. -- Andre Vehreschild * Kreuzherrenstr. 8 * 52062 Aachen Mail: ve...@gmx.de * Tel.: +49 241 9291018
RE: [PATCH] Fix mips-{mti,img}-linux-gnu boot-strap
Bernd Edlinger writes: > On Sat, 4 Jul 2015 09:04:41, Richard Sandiford wrote: > > > > The final return here would also mishandle SEQUENCE PATTERNs. > > The idea was that this function would only see "real" instructions, > > so I think instead the FOR_EACH_SUBINSN should be here: > > > > static bool > > mips_find_gp_ref (bool *cache, bool (*pred) (rtx_insn *)) > > { > > rtx_insn *insn; > > > > if (!*cache) > > { > > push_topmost_sequence (); > > for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) > > > > > if (USEFUL_INSN_P (insn) && pred (insn)) > > > > Thanks, > > Richard > > Yes, I agree. > > I have now updated my patch as suggested. > > A mips-img-linux-gnu cross compiler builds as expected. > > OK for trunk? OK, thanks. Matthew
Re: [Ping, Patch, Fortran, PR58586, v5] ICE with derived type with allocatable component passed by value
On Fri, Jul 03, 2015 at 11:29:00AM +0200, Andre Vehreschild wrote: > Ping! > (Un)fortnuately you're working on an area of Fortran that I don't know and in parts of the tree that takes me a long time to decipher (aka, trans-*.c files). I applied your patch and see several failures. I'll note that I did not start from a clean obj/. So, there is the possibility that some *.o file needed to get rebuilt but didn't. Anyhow, laptop-kargl:kargl[300] gfc -o z alloc_comp_class_4.f03 && ./z Program received signal SIGSEGV: Segmentation fault - invalid memory reference. Backtrace for this error: #0 0x2808B9A6 #1 0x2808AB19 #2 0xBFBFF003 #3 0x8048DEA #4 0x8049097 #5 0x8048779 Segmentation fault (core dumped) Hmmm, Ok, I just looked at the source for alloc_comp_class_4.f03 and found line 94. temp = t_init() ! <-- This derefs a null-pointer currently Not sure what to make of this. -- Steve
Re: RFC: Add ADDR_EXPR lowering (PR tree-optimization/66718)
On July 4, 2015 4:17:02 PM GMT+02:00, Jakub Jelinek wrote: >On Sat, Jul 04, 2015 at 09:20:04AM +0200, Jakub Jelinek wrote: >> On Fri, Jul 03, 2015 at 03:21:47PM +0200, Marek Polacek wrote: >> > This patch implements a new pass, called laddress, which deals with >> > lowering ADDR_EXPR assignments. Such lowering ought to help the >> > vectorizer, but it also could expose more CSE opportunities, maybe >> > help reassoc, etc. It's only active when optimize != 0. >> > >> > So e.g. >> > _1 = (sizetype) i_9; >> > _7 = _1 * 4; >> > _4 = &b + _7; >> > instead of >> > _4 = &b[i_9]; >> > >> > This triggered 14105 times during the regtest and 6392 times during >> > the bootstrap. >> > >> > The fallout (at least on x86_64) is surprisingly small, i.e. none, >just >> > gcc.dg/vect/pr59984.c test (using -fopenmp-simd) ICEs, but that is >due >> > to a bug in the vectorizer. Jakub has a patch and knows the >details. >> > As the test shows, we're now able to vectorize ADDR_EXPR of >non-invariants >> > (that was the motivation of this pass). >> >> Just FYI, while bootstrapping/regtesting your patch together with the >one >> I've posted and another one to vectorize pr59984.c better, I've >noticed there >> is another regression with your patch (reverting my patches doesn't >help, >> disabling your gate does help): >> >> +FAIL: libgomp.c/simd-3.c execution test >> +FAIL: libgomp.c++/simd-3.C execution test >> >> on both x86_64-linux and i686-linux (at least on AVX capable box). >> Most likely hitting another latent vectorizer issue, haven't analyzed >it >> yet. > >Here is a fix for that, bootstrapped/regtested on x86_64-linux and >i686-linux on top of Marek's patch and my two other patches, ok for >trunk? OK. Thanks, Richard. >The problem was that in loops that don't need any scalar post-loop the >GOMP_SIMD_LAST_LANE value was wrong - 0 instead of vf - 1. > >2015-07-04 Jakub Jelinek > > PR tree-optimization/66718 > * tree-vect-stmts.c (vectorizable_call): Replace uses of > GOMP_SIMD_LANE outside of loop with vf - 1 rather than 0. > >--- gcc/tree-vect-stmts.c.jj 2015-07-03 20:43:42.0 +0200 >+++ gcc/tree-vect-stmts.c 2015-07-04 14:08:18.659356110 +0200 >@@ -2601,6 +2601,30 @@ vectorizable_call (gimple gs, gimple_stm > lhs = gimple_call_lhs (STMT_VINFO_RELATED_STMT (stmt_info)); > else > lhs = gimple_call_lhs (stmt); >+ >+ if (gimple_call_internal_p (stmt) >+ && gimple_call_internal_fn (stmt) == IFN_GOMP_SIMD_LANE) >+{ >+ /* Replace uses of the lhs of GOMP_SIMD_LANE call outside the >loop >+ with vf - 1 rather than 0, that is the last iteration of the >+ vectorized loop. */ >+ imm_use_iterator iter; >+ use_operand_p use_p; >+ gimple use_stmt; >+ FOR_EACH_IMM_USE_STMT (use_stmt, iter, lhs) >+ { >+basic_block use_bb = gimple_bb (use_stmt); >+if (use_bb >+&& !flow_bb_inside_loop_p (LOOP_VINFO_LOOP (loop_vinfo), >use_bb)) >+ { >+FOR_EACH_IMM_USE_ON_STMT (use_p, iter) >+ SET_USE (use_p, build_int_cst (TREE_TYPE (lhs), >+ ncopies * nunits_out - 1)); >+update_stmt (use_stmt); >+ } >+ } >+} >+ > new_stmt = gimple_build_assign (lhs, build_zero_cst (type)); > set_vinfo_for_stmt (new_stmt, stmt_info); > set_vinfo_for_stmt (stmt, NULL); > > > Jakub
Clean-ups in match.pd
Hello, these are just some minor changes. I believe I had already promised a build_ function to match integer_each_onep. Bootstrap+testsuite on powerpc64le-unknown-linux-gnu (it looks like *-match.c takes about 10 minutes to compile in stage2 these days). 2015-07-06 Marc Glisse * match.pd: Remove element_mode inside HONOR_*. (~ (-A) -> A - 1, ~ (A - 1) -> -A): Handle complex types. (~X | X -> -1, ~X ^ X -> -1): Merge. * tree.c (build_each_one_cst): New function. * tree.h (build_each_one_cst): Likewise. -- Marc GlisseIndex: match.pd === --- match.pd(revision 225411) +++ match.pd(working copy) @@ -101,7 +101,7 @@ negative value by 0 gives -0, not +0. */ (simplify (mult @0 real_zerop@1) - (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (element_mode (type))) + (if (!HONOR_NANS (type) && !HONOR_SIGNED_ZEROS (type)) @1)) /* In IEEE floating point, x*1 is not equivalent to x for snans. @@ -108,8 +108,8 @@ Likewise for complex arithmetic with signed zeros. */ (simplify (mult @0 real_onep) - (if (!HONOR_SNANS (element_mode (type)) - && (!HONOR_SIGNED_ZEROS (element_mode (type)) + (if (!HONOR_SNANS (type) + && (!HONOR_SIGNED_ZEROS (type) || !COMPLEX_FLOAT_TYPE_P (type))) (non_lvalue @0))) @@ -116,8 +116,8 @@ /* Transform x * -1.0 into -x. */ (simplify (mult @0 real_minus_onep) - (if (!HONOR_SNANS (element_mode (type)) - && (!HONOR_SIGNED_ZEROS (element_mode (type)) + (if (!HONOR_SNANS (type) + && (!HONOR_SIGNED_ZEROS (type) || !COMPLEX_FLOAT_TYPE_P (type))) (negate @0))) @@ -165,7 +165,7 @@ (rdiv @0 @0) (if (FLOAT_TYPE_P (type) && ! HONOR_NANS (type) - && ! HONOR_INFINITIES (element_mode (type))) + && ! HONOR_INFINITIES (type)) { build_one_cst (type); })) /* Optimize -A / A to -1.0 if we don't care about @@ -174,19 +174,19 @@ (rdiv:c @0 (negate @0)) (if (FLOAT_TYPE_P (type) && ! HONOR_NANS (type) - && ! HONOR_INFINITIES (element_mode (type))) + && ! HONOR_INFINITIES (type)) { build_minus_one_cst (type); })) /* In IEEE floating point, x/1 is not equivalent to x for snans. */ (simplify (rdiv @0 real_onep) - (if (!HONOR_SNANS (element_mode (type))) + (if (!HONOR_SNANS (type)) (non_lvalue @0))) /* In IEEE floating point, x/-1 is not equivalent to -x for snans. */ (simplify (rdiv @0 real_minus_onep) - (if (!HONOR_SNANS (element_mode (type))) + (if (!HONOR_SNANS (type)) (negate @0))) /* If ARG1 is a constant, we can convert this to a multiply by the @@ -297,9 +297,10 @@ @1) /* ~x | x -> -1 */ -(simplify - (bit_ior:c (convert? @0) (convert? (bit_not @0))) - (convert { build_all_ones_cst (TREE_TYPE (@0)); })) +(for op (bit_ior bit_xor plus) + (simplify + (op:c (convert? @0) (convert? (bit_not @0))) + (convert { build_all_ones_cst (TREE_TYPE (@0)); }))) /* x ^ x -> 0 */ (simplify @@ -311,11 +312,6 @@ (bit_xor @0 integer_all_onesp@1) (bit_not @0)) -/* ~X ^ X is -1. */ -(simplify - (bit_xor:c (bit_not @0) @0) - { build_all_ones_cst (type); }) - /* x & ~0 -> x */ (simplify (bit_and @0 integer_all_onesp) @@ -603,11 +599,11 @@ (simplify (bit_not (convert? (negate @0))) (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) - (convert (minus @0 { build_one_cst (TREE_TYPE (@0)); } + (convert (minus @0 { build_each_one_cst (TREE_TYPE (@0)); } /* Convert ~ (A - 1) or ~ (A + -1) to -A. */ (simplify - (bit_not (convert? (minus @0 integer_onep))) + (bit_not (convert? (minus @0 integer_each_onep))) (if (tree_nop_conversion_p (type, TREE_TYPE (@0))) (convert (negate @0 (simplify Index: tree.c === --- tree.c (revision 225411) +++ tree.c (working copy) @@ -1968,6 +1968,21 @@ return t; } +/* Return the constant 1 in type TYPE. If TYPE has several elements, each + element is set to 1. In particular, this is 1 + i for complex types. */ + +tree +build_each_one_cst (tree type) +{ + if (TREE_CODE (type) == COMPLEX_TYPE) +{ + tree scalar = build_one_cst (TREE_TYPE (type)); + return build_complex (type, scalar, scalar); +} + else +return build_one_cst (type); +} + /* Return a constant of arithmetic type TYPE which is the multiplicative identity of the set TYPE. */ Index: tree.h === --- tree.h (revision 225411) +++ tree.h (working copy) @@ -3772,6 +3772,7 @@ extern tree build_constructor_va (tree, int, ...); extern tree build_real_from_int_cst (tree, const_tree); extern tree build_complex (tree, tree, tree); +extern tree build_each_one_cst (tree); extern tree build_one_cst (tree); extern tree build_minus_one_cst (tree); extern tree build_all_ones_cst (tree);
Re: RFC: Add ADDR_EXPR lowering (PR tree-optimization/66718)
On Fri, Jul 03, 2015 at 04:06:26PM +0200, Jakub Jelinek wrote: > In the pr59984.c testcase, with Marek's patch and this patch, one loop in > test is already vectorized (the ICE was on the other one), I'll work on > recognizing multiples of GOMP_SIMD_LANE () as linear next, so that we > vectorize also the loop with bar. Without Marek's patch we weren't And here is a patch to vectorize everything in pr59984.c. For the purpose of elemental functions, addresses of variables in simd magic arrays (e.g. private, linear, reduction etc.) are linear, but they aren't linear in the whole loop, just are linear within the vectorization factor. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? This one really depends on Marek's patch, doesn't make sense to commit before it goes in. 2015-07-04 Jakub Jelinek PR tree-optimization/66718 * tree-vect-stmts.c (struct simd_call_arg_info): Add simd_lane_linear field. (vectorizable_simd_clone_call): Support using linear arguments for addresses of arrays elements indexed by GOMP_SIMD_LANE result. --- gcc/tree-vect-stmts.c.jj2015-07-03 14:06:28.0 +0200 +++ gcc/tree-vect-stmts.c 2015-07-03 20:43:42.796573076 +0200 @@ -2618,6 +2618,7 @@ struct simd_call_arg_info enum vect_def_type dt; HOST_WIDE_INT linear_step; unsigned int align; + bool simd_lane_linear; }; /* Function vectorizable_simd_clone_call. @@ -2702,6 +2703,7 @@ vectorizable_simd_clone_call (gimple stm thisarginfo.linear_step = 0; thisarginfo.align = 0; thisarginfo.op = NULL_TREE; + thisarginfo.simd_lane_linear = false; op = gimple_call_arg (stmt, i); if (!vect_is_simple_use_1 (op, stmt, loop_vinfo, bb_vinfo, @@ -2724,21 +2726,24 @@ vectorizable_simd_clone_call (gimple stm /* For linear arguments, the analyze phase should have saved the base and step in STMT_VINFO_SIMD_CLONE_INFO. */ - if (i * 2 + 3 <= STMT_VINFO_SIMD_CLONE_INFO (stmt_info).length () - && STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 2 + 2]) + if (i * 3 + 4 <= STMT_VINFO_SIMD_CLONE_INFO (stmt_info).length () + && STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 3 + 2]) { gcc_assert (vec_stmt); thisarginfo.linear_step - = tree_to_shwi (STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 2 + 2]); + = tree_to_shwi (STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 3 + 2]); thisarginfo.op - = STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 2 + 1]; + = STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 3 + 1]; + thisarginfo.simd_lane_linear + = (STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 3 + 3] + == boolean_true_node); /* If loop has been peeled for alignment, we need to adjust it. */ tree n1 = LOOP_VINFO_NITERS_UNCHANGED (loop_vinfo); tree n2 = LOOP_VINFO_NITERS (loop_vinfo); - if (n1 != n2) + if (n1 != n2 && !thisarginfo.simd_lane_linear) { tree bias = fold_build2 (MINUS_EXPR, TREE_TYPE (n1), n1, n2); - tree step = STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 2 + 2]; + tree step = STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 3 + 2]; tree opt = TREE_TYPE (thisarginfo.op); bias = fold_convert (TREE_TYPE (step), bias); bias = fold_build2 (MULT_EXPR, TREE_TYPE (step), bias, step); @@ -2764,6 +2769,93 @@ vectorizable_simd_clone_call (gimple stm || thisarginfo.dt == vect_external_def) && POINTER_TYPE_P (TREE_TYPE (op))) thisarginfo.align = get_pointer_alignment (op) / BITS_PER_UNIT; + /* Addresses of array elements indexed by GOMP_SIMD_LANE are +linear too. */ + if (POINTER_TYPE_P (TREE_TYPE (op)) + && !thisarginfo.linear_step + && !vec_stmt + && thisarginfo.dt != vect_constant_def + && thisarginfo.dt != vect_external_def + && loop_vinfo + && !slp_node + && TREE_CODE (op) == SSA_NAME) + { + def_stmt = SSA_NAME_DEF_STMT (op); + if (is_gimple_assign (def_stmt) + && gimple_assign_rhs_code (def_stmt) == POINTER_PLUS_EXPR + && is_gimple_min_invariant (gimple_assign_rhs1 (def_stmt))) + { + tree base = gimple_assign_rhs1 (def_stmt); + HOST_WIDE_INT linear_step = 0; + tree v = gimple_assign_rhs2 (def_stmt); + while (v && TREE_CODE (v) == SSA_NAME) + { + tree t; + def_stmt = SSA_NAME_DEF_STMT (v); + if (is_gimple_assign (def_stmt)) + switch (gimple_assign_rhs_code (def_stmt)) + { + case PLUS_EXPR: + t = gimple_assign_rhs2 (def_stmt); + if (linear_step || TREE_CODE (t) != INTEGER_CST) + { +
Re: RFC: Add ADDR_EXPR lowering (PR tree-optimization/66718)
On Sat, Jul 04, 2015 at 09:20:04AM +0200, Jakub Jelinek wrote: > On Fri, Jul 03, 2015 at 03:21:47PM +0200, Marek Polacek wrote: > > This patch implements a new pass, called laddress, which deals with > > lowering ADDR_EXPR assignments. Such lowering ought to help the > > vectorizer, but it also could expose more CSE opportunities, maybe > > help reassoc, etc. It's only active when optimize != 0. > > > > So e.g. > > _1 = (sizetype) i_9; > > _7 = _1 * 4; > > _4 = &b + _7; > > instead of > > _4 = &b[i_9]; > > > > This triggered 14105 times during the regtest and 6392 times during > > the bootstrap. > > > > The fallout (at least on x86_64) is surprisingly small, i.e. none, just > > gcc.dg/vect/pr59984.c test (using -fopenmp-simd) ICEs, but that is due > > to a bug in the vectorizer. Jakub has a patch and knows the details. > > As the test shows, we're now able to vectorize ADDR_EXPR of non-invariants > > (that was the motivation of this pass). > > Just FYI, while bootstrapping/regtesting your patch together with the one > I've posted and another one to vectorize pr59984.c better, I've noticed there > is another regression with your patch (reverting my patches doesn't help, > disabling your gate does help): > > +FAIL: libgomp.c/simd-3.c execution test > +FAIL: libgomp.c++/simd-3.C execution test > > on both x86_64-linux and i686-linux (at least on AVX capable box). > Most likely hitting another latent vectorizer issue, haven't analyzed it > yet. Here is a fix for that, bootstrapped/regtested on x86_64-linux and i686-linux on top of Marek's patch and my two other patches, ok for trunk? The problem was that in loops that don't need any scalar post-loop the GOMP_SIMD_LAST_LANE value was wrong - 0 instead of vf - 1. 2015-07-04 Jakub Jelinek PR tree-optimization/66718 * tree-vect-stmts.c (vectorizable_call): Replace uses of GOMP_SIMD_LANE outside of loop with vf - 1 rather than 0. --- gcc/tree-vect-stmts.c.jj2015-07-03 20:43:42.0 +0200 +++ gcc/tree-vect-stmts.c 2015-07-04 14:08:18.659356110 +0200 @@ -2601,6 +2601,30 @@ vectorizable_call (gimple gs, gimple_stm lhs = gimple_call_lhs (STMT_VINFO_RELATED_STMT (stmt_info)); else lhs = gimple_call_lhs (stmt); + + if (gimple_call_internal_p (stmt) + && gimple_call_internal_fn (stmt) == IFN_GOMP_SIMD_LANE) +{ + /* Replace uses of the lhs of GOMP_SIMD_LANE call outside the loop +with vf - 1 rather than 0, that is the last iteration of the +vectorized loop. */ + imm_use_iterator iter; + use_operand_p use_p; + gimple use_stmt; + FOR_EACH_IMM_USE_STMT (use_stmt, iter, lhs) + { + basic_block use_bb = gimple_bb (use_stmt); + if (use_bb + && !flow_bb_inside_loop_p (LOOP_VINFO_LOOP (loop_vinfo), use_bb)) + { + FOR_EACH_IMM_USE_ON_STMT (use_p, iter) + SET_USE (use_p, build_int_cst (TREE_TYPE (lhs), + ncopies * nunits_out - 1)); + update_stmt (use_stmt); + } + } +} + new_stmt = gimple_build_assign (lhs, build_zero_cst (type)); set_vinfo_for_stmt (new_stmt, stmt_info); set_vinfo_for_stmt (stmt, NULL); Jakub
[committed] Remove condition from PA indirect_jump
The attached change removes the C condition from the PA indirect jump since it depended on operands[] and this isn't generally allowed in named patterns. Tested on hppa-unknown-linux-gnu and hppa2.0w-hp-hpux11.11. Committed to trunk and active branches. Dave -- John David Anglin dave.ang...@bell.net 2015-07-04 John David Anglin PR target/66114 * config/pa/pa.md (indirect_jump): Use pmode_register_operand instead of register_operand. Remove constraint. Index: config/pa/pa.md === --- config/pa/pa.md (revision 225280) +++ config/pa/pa.md (working copy) @@ -6844,8 +6844,8 @@ ;;; Hope this is only within a function... (define_insn "indirect_jump" - [(set (pc) (match_operand 0 "register_operand" "r"))] - "GET_MODE (operands[0]) == word_mode" + [(set (pc) (match_operand 0 "pmode_register_operand" "r"))] + "" "bv%* %%r0(%0)" [(set_attr "type" "branch") (set_attr "length" "4")])
Re: [PR66726] Factor conversion out of COND_EXPR
On July 4, 2015 2:32:11 PM GMT+02:00, Kugan wrote: >On 04/07/15 18:51, Bernhard Reutner-Fischer wrote: >> On Sat, Jul 04, 2015 at 12:58:58PM +1000, Kugan wrote: >>> Please find a patch that attempt to FIX PR66726 by factoring >conversion >>> out of COND_EXPR as explained in the PR. >>> >>> Bootstrapped and regression tested on x86-64-none-linux-gnu with no >new >>> regressions. Is this OK for trunk? >>> >>> Thanks, >>> Kugan >>> >>> >>> gcc/testsuite/ChangeLog: >>> >>> 2015-07-03 Kugan Vivekanandarajah >>> Jeff Law >>> >>> PR middle-end/66726 >>> * gcc.dg/tree-ssa/pr66726.c: New test. >> >> I'd have scanned the details dump for "factor CONVERT_EXPR out" 1 >> to make sure that it's this part that takes care of it. I meant in addition, just to be sure. Patch itself looks plausible to me but I cannot approve it. Thanks, > >Thanks for the comments. Please see the updated patch with the fixes. >Kugan > >> >>> >>> gcc/ChangeLog: >>> >>> 2015-07-03 Kugan Vivekanandarajah >>> >>> PR middle-end/66726 >>> * tree-ssa-phiopt.c (factor_out_conditional_conversion): New >function. >>> (tree_ssa_phiopt_worker): Call factor_out_conditional_conversion. >> >>> diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c >>> index d2a5cee..e8af086 100644 >>> --- a/gcc/tree-ssa-phiopt.c >>> +++ b/gcc/tree-ssa-phiopt.c >> >>> @@ -410,6 +413,108 @@ replace_phi_edge_with_variable (basic_block >cond_block, >>> bb->index); >>> } >>> >>> +/* PR66726: Factor conversion out of COND_EXPR. If the argument of >the PHI >> >> s/the argument/the arguments/ >> >>> + stmt are CONVERT_STMT, factor out the conversion and perform the >conversion >>> + to the result of PHI stmt. */ >>> + >>> +static bool >>> +factor_out_conditional_conversion (edge e0, edge e1, gphi *phi, >>> + tree arg0, tree arg1) >>> +{ >>> + gimple def0 = NULL, def1 = NULL, new_stmt; >>> + tree new_arg0 = NULL_TREE, new_arg1 = NULL_TREE; >>> + tree temp, result; >>> + gimple_stmt_iterator gsi; >>> + >>> + /* One of the argument has to be SSA_NAME and other argument can >> >> s/the argument/the arguments/ >> >>> + be an SSA_NAME of INTEGER_CST. */ >>> + if ((TREE_CODE (arg0) != SSA_NAME >>> + && TREE_CODE (arg0) != INTEGER_CST) >>> + || (TREE_CODE (arg1) != SSA_NAME >>> + && TREE_CODE (arg1) != INTEGER_CST) >>> + || (TREE_CODE (arg0) == INTEGER_CST >>> + && TREE_CODE (arg1) == INTEGER_CST)) >> >> inconsistent space for the && lines above; The first should have a >> leading tab. >> >>> +return false; >>> + >>> + /* Handle only PHI statements with two arguments. TODO: If all >>> + other arguments to PHI are INTEGER_CST, we can handle more >>> + than two arguments too. */ >>> + if (gimple_phi_num_args (phi) != 2) >>> +return false; >>> + >>> + /* If arg0 is an SSA_NAME and the stmt which defines arg0 is >>> + ai CONVERT_STMT, use the LHS as new_arg0. */ >> >> s/ai/a/ >> >>> + if (TREE_CODE (arg0) == SSA_NAME) >>> +{ >>> + def0 = SSA_NAME_DEF_STMT (arg0); >>> + if (!is_gimple_assign (def0) >>> + || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def0))) >>> + return false; >>> + new_arg0 = gimple_assign_rhs1 (def0); >>> +} >>> + >>> + /* If arg1 is an SSA_NAME and the stmt which defines arg0 is >>> + ai CONVERT_STMT, use the LHS as new_arg1. */ >> >> s/ai/a/ >> >>> + if (TREE_CODE (arg1) == SSA_NAME) >>> +{ >>> + def1 = SSA_NAME_DEF_STMT (arg1); >>> + if (!is_gimple_assign (def1) >>> + || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def1))) >>> + return false; >>> + new_arg1 = gimple_assign_rhs1 (def1); >>> +} >>> + >>> + /* If arg0 is an INTEGER_CST, fold it to new type. */ >>> + if (TREE_CODE (arg0) != SSA_NAME) >>> +{ >>> + if (!POINTER_TYPE_P (TREE_TYPE (new_arg1)) >>> + && int_fits_type_p (arg0, TREE_TYPE (new_arg1))) >>> + new_arg0 = fold_convert (TREE_TYPE (new_arg1), arg0); >>> + else >>> + return false; >>> +} >>> + >>> + /* If arg1 is an INTEGER_CST, fold it to new type. */ >>> + if (TREE_CODE (arg1) != SSA_NAME) >>> +{ >>> + if (!POINTER_TYPE_P (TREE_TYPE (new_arg0)) >>> + && int_fits_type_p (arg1, TREE_TYPE (new_arg0))) >>> + new_arg1 = fold_convert (TREE_TYPE (new_arg0), arg1); >>> + else >>> + return false; >>> +} >>> + >>> + /* If types of new_arg0 and new_arg1 are different bailout. */ >>> + if (TREE_TYPE (new_arg0) != TREE_TYPE (new_arg1)) >>> +return false; >>> + >>> + /* Replace the PHI stmt with the new_arg0 and new_arg1. Also >insert >>> + a new CONVER_STMT that converts the phi results. */ >> >> s/CONVER_STMT/CONVERT_STMT/ >> >>> + gsi = gsi_after_labels (gimple_bb (phi)); >>> + result = PHI_RESULT (phi); >>> + temp = make_ssa_name (TREE_TYPE (new_arg0), phi); >>> + >>> + if (dump_file && (dump_flags & TDF_DETAILS)) >>> +{ >>> + fprintf (dump_file, "P
RE: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation
-Original Message- From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Tuesday, June 30, 2015 4:42 PM To: Ajit Kumar Agarwal Cc: l...@redhat.com; GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation On Tue, Jun 30, 2015 at 10:16 AM, Ajit Kumar Agarwal wrote: > All: > > The below patch added a new path Splitting optimization pass on SSA > representation. The Path Splitting optimization Pass moves the join > block of if-then-else same as loop latch to its predecessors and get merged > with the predecessors Preserving the SSA representation. > > The patch is tested for Microblaze and i386 target. The EEMBC/Mibench > benchmarks is run with the Microblaze target And the performance gain > of 9.15% and rgbcmy01_lite(EEMBC benchmarks). The Deja GNU tests is run for > Mircroblaze Target and no regression is seen for Microblaze target and the > new testcase attached are passed. > > For i386 bootstrapping goes through fine and the Spec cpu2000 > benchmarks is run with this patch. Following observation were seen with spec > cpu2000 benchmarks. > > Ratio of path splitting change vs Ratio of not having path splitting change > is 3653.353 vs 3652.14 for INT benchmarks. > Ratio of path splitting change vs Ratio of not having path splitting change > is 4353.812 vs 4345.351 for FP benchmarks. > > Based on comments from RFC patch following changes were done. > > 1. Added a new pass for path splitting changes. > 2. Placed the new path Splitting Optimization pass before the copy > propagation pass. > 3. The join block same as the Loop latch is wired into its > predecessors so that the CFG Cleanup pass will merge the blocks Wired > together. > 4. Copy propagation routines added for path splitting changes is not > needed as suggested by Jeff. They are removed in the patch as The copy > propagation in the copied join blocks will be done by the existing copy > propagation pass and the update ssa pass. > 5. Only the propagation of phi results of the join block with the phi > argument is done which will not be done by the existing update_ssa Or copy > propagation pass on tree ssa representation. > 6. Added 2 tests. > a) compilation check tests. >b) execution tests. > 7. Refactoring of the code for the feasibility check and finding the join > block same as loop latch node. > > [Patch,tree-optimization]: Add new path Splitting pass on tree ssa > representation. > > Added a new pass on path splitting on tree SSA representation. The path > splitting optimization does the CFG transformation of join block of the > if-then-else same as the loop latch node is moved and merged with the > predecessor blocks after preserving the SSA representation. > > ChangeLog: > 2015-06-30 Ajit Agarwal > > * gcc/Makefile.in: Add the build of the new file > tree-ssa-path-split.c > * gcc/common.opt: Add the new flag ftree-path-split. > * gcc/opts.c: Add an entry for Path splitting pass > with optimization flag greater and equal to O2. > * gcc/passes.def: Enable and add new pass path splitting. > * gcc/timevar.def: Add the new entry for TV_TREE_PATH_SPLIT. > * gcc/tree-pass.h: Extern Declaration of make_pass_path_split. > * gcc/tree-ssa-path-split.c: New file for path splitting pass. > * gcc/testsuite/gcc.dg/tree-ssa/path-split-2.c: New testcase. > * gcc/testsuite/gcc.dg/path-split-1.c: New testcase. >>I'm not 100% sure I understand the transform but what I see from the >>testcases it tail-duplicates from a conditional up to a loop latch block (not >>sure if it >>includes it and thus ends up creating a loop nest or not). >>An observation I have is that the pass should at least share the transform >>stage to some extent with the existing tracer pass (tracer.c) which >>essentially does >>the same but not restricted to loops in any way. The following piece of code from tracer.c can be shared with the existing path splitting pass. { e = find_edge (bb, bb2); copy = duplicate_block (bb2, e, bb); flush_pending_stmts (e); add_phi_args_after_copy (©, 1, NULL); } Sharing the above code of the transform stage of tracer.c with the path splitting pass has the following limitation. 1. The duplicated loop latch node is wired to its predecessors and the existing phi node in the loop latch node with the Phi arguments from its corresponding predecessors is moved to the duplicated loop latch node that is wired into its predecessors. Due To this, the duplicated loop latch nodes wired into its predecessors will not be merged with the original predecessors by CFG cleanup phase . >> So I wonder if your pass could be simply another heuristic to compute paths >> to trace in the existing tracer pass. Sorry, I
Re: [PR66726] Factor conversion out of COND_EXPR
On 04/07/15 18:51, Bernhard Reutner-Fischer wrote: > On Sat, Jul 04, 2015 at 12:58:58PM +1000, Kugan wrote: >> Please find a patch that attempt to FIX PR66726 by factoring conversion >> out of COND_EXPR as explained in the PR. >> >> Bootstrapped and regression tested on x86-64-none-linux-gnu with no new >> regressions. Is this OK for trunk? >> >> Thanks, >> Kugan >> >> >> gcc/testsuite/ChangeLog: >> >> 2015-07-03 Kugan Vivekanandarajah >> Jeff Law >> >> PR middle-end/66726 >> * gcc.dg/tree-ssa/pr66726.c: New test. > > I'd have scanned the details dump for "factor CONVERT_EXPR out" 1 > to make sure that it's this part that takes care of it. Thanks for the comments. Please see the updated patch with the fixes. Kugan > >> >> gcc/ChangeLog: >> >> 2015-07-03 Kugan Vivekanandarajah >> >> PR middle-end/66726 >> * tree-ssa-phiopt.c (factor_out_conditional_conversion): New function. >> (tree_ssa_phiopt_worker): Call factor_out_conditional_conversion. > >> diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c >> index d2a5cee..e8af086 100644 >> --- a/gcc/tree-ssa-phiopt.c >> +++ b/gcc/tree-ssa-phiopt.c > >> @@ -410,6 +413,108 @@ replace_phi_edge_with_variable (basic_block cond_block, >>bb->index); >> } >> >> +/* PR66726: Factor conversion out of COND_EXPR. If the argument of the PHI > > s/the argument/the arguments/ > >> + stmt are CONVERT_STMT, factor out the conversion and perform the >> conversion >> + to the result of PHI stmt. */ >> + >> +static bool >> +factor_out_conditional_conversion (edge e0, edge e1, gphi *phi, >> + tree arg0, tree arg1) >> +{ >> + gimple def0 = NULL, def1 = NULL, new_stmt; >> + tree new_arg0 = NULL_TREE, new_arg1 = NULL_TREE; >> + tree temp, result; >> + gimple_stmt_iterator gsi; >> + >> + /* One of the argument has to be SSA_NAME and other argument can > > s/the argument/the arguments/ > >> + be an SSA_NAME of INTEGER_CST. */ >> + if ((TREE_CODE (arg0) != SSA_NAME >> + && TREE_CODE (arg0) != INTEGER_CST) >> + || (TREE_CODE (arg1) != SSA_NAME >> + && TREE_CODE (arg1) != INTEGER_CST) >> + || (TREE_CODE (arg0) == INTEGER_CST >> + && TREE_CODE (arg1) == INTEGER_CST)) > > inconsistent space for the && lines above; The first should have a > leading tab. > >> +return false; >> + >> + /* Handle only PHI statements with two arguments. TODO: If all >> + other arguments to PHI are INTEGER_CST, we can handle more >> + than two arguments too. */ >> + if (gimple_phi_num_args (phi) != 2) >> +return false; >> + >> + /* If arg0 is an SSA_NAME and the stmt which defines arg0 is >> + ai CONVERT_STMT, use the LHS as new_arg0. */ > > s/ai/a/ > >> + if (TREE_CODE (arg0) == SSA_NAME) >> +{ >> + def0 = SSA_NAME_DEF_STMT (arg0); >> + if (!is_gimple_assign (def0) >> + || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def0))) >> +return false; >> + new_arg0 = gimple_assign_rhs1 (def0); >> +} >> + >> + /* If arg1 is an SSA_NAME and the stmt which defines arg0 is >> + ai CONVERT_STMT, use the LHS as new_arg1. */ > > s/ai/a/ > >> + if (TREE_CODE (arg1) == SSA_NAME) >> +{ >> + def1 = SSA_NAME_DEF_STMT (arg1); >> + if (!is_gimple_assign (def1) >> + || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def1))) >> +return false; >> + new_arg1 = gimple_assign_rhs1 (def1); >> +} >> + >> + /* If arg0 is an INTEGER_CST, fold it to new type. */ >> + if (TREE_CODE (arg0) != SSA_NAME) >> +{ >> + if (!POINTER_TYPE_P (TREE_TYPE (new_arg1)) >> + && int_fits_type_p (arg0, TREE_TYPE (new_arg1))) >> +new_arg0 = fold_convert (TREE_TYPE (new_arg1), arg0); >> + else >> +return false; >> +} >> + >> + /* If arg1 is an INTEGER_CST, fold it to new type. */ >> + if (TREE_CODE (arg1) != SSA_NAME) >> +{ >> + if (!POINTER_TYPE_P (TREE_TYPE (new_arg0)) >> + && int_fits_type_p (arg1, TREE_TYPE (new_arg0))) >> +new_arg1 = fold_convert (TREE_TYPE (new_arg0), arg1); >> + else >> +return false; >> +} >> + >> + /* If types of new_arg0 and new_arg1 are different bailout. */ >> + if (TREE_TYPE (new_arg0) != TREE_TYPE (new_arg1)) >> +return false; >> + >> + /* Replace the PHI stmt with the new_arg0 and new_arg1. Also insert >> + a new CONVER_STMT that converts the phi results. */ > > s/CONVER_STMT/CONVERT_STMT/ > >> + gsi = gsi_after_labels (gimple_bb (phi)); >> + result = PHI_RESULT (phi); >> + temp = make_ssa_name (TREE_TYPE (new_arg0), phi); >> + >> + if (dump_file && (dump_flags & TDF_DETAILS)) >> +{ >> + fprintf (dump_file, "PHI "); >> + print_generic_expr (dump_file, gimple_phi_result (phi), 0); >> + fprintf (dump_file, >> + " changed to factor CONVERT_EXPR out from COND_EXPR.\n"); >> + fprintf (dump_file, "New PHI_RESULT is "); >> + print_generic_expr (dump_file, temp, 0); >> +
RE: [PATCH] Fix mips-{mti,img}-linux-gnu boot-strap
Hi, On Sat, 4 Jul 2015 09:04:41, Richard Sandiford wrote: > > The final return here would also mishandle SEQUENCE PATTERNs. > The idea was that this function would only see "real" instructions, > so I think instead the FOR_EACH_SUBINSN should be here: > > static bool > mips_find_gp_ref (bool *cache, bool (*pred) (rtx_insn *)) > { > rtx_insn *insn; > > if (!*cache) > { > push_topmost_sequence (); > for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) > > > if (USEFUL_INSN_P (insn) && pred (insn)) > > Thanks, > Richard Yes, I agree. I have now updated my patch as suggested. A mips-img-linux-gnu cross compiler builds as expected. OK for trunk? Thanks Bernd. 2015-07-04 Bernd Edlinger PR target/66747 * config/mips/mips.c (mips_find_gp_ref): Handle instruction sequences. patch-pr66747.diff Description: Binary data
Re: [PATCH 1/2][ARM] PR/65956 AAPCS update for alignment attribute
On Sat, Jul 04, 2015 at 12:57:36PM +0200, Richard Biener wrote: > + if (!AGGREGATE_TYPE_P (type)) > +return TYPE_ALIGN (TYPE_MAIN_VARIANT (type)) > PARM_BOUNDARY; > + > + for (tree field = TYPE_FIELDS (type); field; field = DECL_CHAIN > >>> (field)) > +if (DECL_ALIGN (field) > PARM_BOUNDARY) > + return true; > > I also believe this loop is equivalent to checking TYPE_ALIGN of the > aggregate type? Is it? What if you do struct __attribute__((aligned (32))) S { char a; int b; char c; }; ? In this case, TYPE_MAIN_VARIANT of S is S itself, and has TYPE_USER_ALIGN and TYPE_ALIGN 256. Jakub
Re: [PATCH 1/2][ARM] PR/65956 AAPCS update for alignment attribute
On July 3, 2015 10:43:30 PM GMT+02:00, Richard Earnshaw wrote: >On 03/07/15 19:24, Richard Biener wrote: >> On July 3, 2015 6:11:13 PM GMT+02:00, Richard Earnshaw > wrote: >>> On 03/07/15 16:26, Alan Lawrence wrote: These include tests of structs, scalars, and vectors - only general-purpose registers are affected by the ABI rules for >>> alignment, but we can restrict the vector test to use the base AAPCS. Prior to this patch, align2.c, align3.c and align_rec1.c were >failing (the latter showing an internal inconsistency, the first two merely >>> that GCC did not obey the new ABI). With this patch, the align_rec2.c fails, and also gcc.c-torture/execute/20040709-1.c at -O0 only, both because of a >>> latent bug where we can emit strd/ldrd on an odd-numbered register in ARM state, fixed by the second patch. gcc/ChangeLog: * config/arm/arm.c (arm_needs_doubleword_align): Drop any outer alignment attribute, exploring one level down for aggregates. gcc/testsuite/ChangeLog: * gcc.target/arm/aapcs/align1.c: New. * gcc.target/arm/aapcs/align_rec1.c: New. * gcc.target/arm/aapcs/align2.c: New. * gcc.target/arm/aapcs/align_rec2.c: New. * gcc.target/arm/aapcs/align3.c: New. * gcc.target/arm/aapcs/align_rec3.c: New. * gcc.target/arm/aapcs/align4.c: New. * gcc.target/arm/aapcs/align_rec4.c: New. * gcc.target/arm/aapcs/align_vararg1.c: New. * gcc.target/arm/aapcs/align_vararg2.c: New. arm_overalign_1.patch diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index >>> >04663999224c8c8eb8e2d10b0ec634db6ce5027e..ee57d30617a2f7e1cd63ca013fe5655a01027581 >>> 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -6020,8 +6020,17 @@ arm_init_cumulative_args (CUMULATIVE_ARGS >>> *pcum, tree fntype, static bool arm_needs_doubleword_align (machine_mode mode, const_tree type) { - return (GET_MODE_ALIGNMENT (mode) > PARM_BOUNDARY -|| (type && TYPE_ALIGN (type) > PARM_BOUNDARY)); + if (!type) +return PARM_BOUNDARY < GET_MODE_ALIGNMENT (mode); + + if (!AGGREGATE_TYPE_P (type)) +return TYPE_ALIGN (TYPE_MAIN_VARIANT (type)) > PARM_BOUNDARY; + + for (tree field = TYPE_FIELDS (type); field; field = DECL_CHAIN >>> (field)) +if (DECL_ALIGN (field) > PARM_BOUNDARY) + return true; I also believe this loop is equivalent to checking TYPE_ALIGN of the aggregate type? I'll double check your wording in the abi document, but it seems to be unclear whether packed and not packed structs should be passed the same (considering layout differences). OTOH the above function is only relevant for register passing? (Likewise the abi document changes?) >> >> Is this behavior correct for unions or aggregates with record or >union members? > >Yes, at least that was my intention. It's an error in the wording of >the proposed change, which I think should say "composite types" not >"aggregate types". > >R. > >> >>> >>> Technically this is incorrect since AGGREGATE_TYPE_P includes >>> ARRAY_TYPE >>> and ARRAY_TYPE doesn't have TYPE_FIELDS. I doubt we could reach >that >>> case though (unless there's a language that allows passing arrays by >>> value). >>> >>> For array types I think you need to check TYPE_ALIGN (TREE_TYPE >>> (type)). >>> >>> R. >>> + return false; } diff --git a/gcc/testsuite/gcc.target/arm/aapcs/align1.c >>> b/gcc/testsuite/gcc.target/arm/aapcs/align1.c new file mode 100644 index >>> >..8981d57c3eaf0bd89d224bec79ff8a45627a0a89 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/aapcs/align1.c @@ -0,0 +1,29 @@ +/* Test AAPCS layout (alignment). */ + +/* { dg-do run { target arm_eabi } } */ +/* { dg-require-effective-target arm32 } */ +/* { dg-options "-O" } */ + +#ifndef IN_FRAMEWORK +#define TESTFILE "align1.c" + +typedef __attribute__((aligned (8))) int alignedint; + +alignedint a = 11; +alignedint b = 13; +alignedint c = 17; +alignedint d = 19; +alignedint e = 23; +alignedint f = 29; + +#include "abitest.h" +#else + ARG (alignedint, a, R0) + /* Attribute suggests R2, but we should use only natural >>> alignment: */ + ARG (alignedint, b, R1) + ARG (alignedint, c, R2) + ARG (alignedint, d, R3) + ARG (alignedint, e, STACK) + /* Attribute would suggest STACK + 8 but should be ignored: */ + LAST_ARG (alignedint, f, STACK + 4) +#endif diff --git a/gcc/testsuite/gcc.target/arm/aapcs/align2.c >>> b/gcc/testsuite/gcc.target/arm/aapcs/align2.c new file mode 100644 index >>> >..992da53c606c793f
New Swedish PO file for 'gcc' (version 5.1.0)
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'gcc' has been submitted by the Swedish team of translators. The file is available at: http://translationproject.org/latest/gcc/sv.po (This file, 'gcc-5.1.0.sv.po', has just now been sent to you in a separate email.) All other PO files for your package are available in: http://translationproject.org/latest/gcc/ Please consider including all of these in your next release, whether official or a pretest. Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context. The following HTML page has been updated: http://translationproject.org/domain/gcc.html If any question arises, please contact the translation coordinator. Thank you for all your work, The Translation Project robot, in the name of your translation coordinator.
Re: [PR66726] Factor conversion out of COND_EXPR
On Sat, Jul 04, 2015 at 12:58:58PM +1000, Kugan wrote: > Please find a patch that attempt to FIX PR66726 by factoring conversion > out of COND_EXPR as explained in the PR. > > Bootstrapped and regression tested on x86-64-none-linux-gnu with no new > regressions. Is this OK for trunk? > > Thanks, > Kugan > > > gcc/testsuite/ChangeLog: > > 2015-07-03 Kugan Vivekanandarajah > Jeff Law > > PR middle-end/66726 > * gcc.dg/tree-ssa/pr66726.c: New test. I'd have scanned the details dump for "factor CONVERT_EXPR out" 1 to make sure that it's this part that takes care of it. > > gcc/ChangeLog: > > 2015-07-03 Kugan Vivekanandarajah > > PR middle-end/66726 > * tree-ssa-phiopt.c (factor_out_conditional_conversion): New function. > (tree_ssa_phiopt_worker): Call factor_out_conditional_conversion. > diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c > index d2a5cee..e8af086 100644 > --- a/gcc/tree-ssa-phiopt.c > +++ b/gcc/tree-ssa-phiopt.c > @@ -410,6 +413,108 @@ replace_phi_edge_with_variable (basic_block cond_block, > bb->index); > } > > +/* PR66726: Factor conversion out of COND_EXPR. If the argument of the PHI s/the argument/the arguments/ > + stmt are CONVERT_STMT, factor out the conversion and perform the > conversion > + to the result of PHI stmt. */ > + > +static bool > +factor_out_conditional_conversion (edge e0, edge e1, gphi *phi, > +tree arg0, tree arg1) > +{ > + gimple def0 = NULL, def1 = NULL, new_stmt; > + tree new_arg0 = NULL_TREE, new_arg1 = NULL_TREE; > + tree temp, result; > + gimple_stmt_iterator gsi; > + > + /* One of the argument has to be SSA_NAME and other argument can s/the argument/the arguments/ > + be an SSA_NAME of INTEGER_CST. */ > + if ((TREE_CODE (arg0) != SSA_NAME > + && TREE_CODE (arg0) != INTEGER_CST) > + || (TREE_CODE (arg1) != SSA_NAME > + && TREE_CODE (arg1) != INTEGER_CST) > + || (TREE_CODE (arg0) == INTEGER_CST > + && TREE_CODE (arg1) == INTEGER_CST)) inconsistent space for the && lines above; The first should have a leading tab. > +return false; > + > + /* Handle only PHI statements with two arguments. TODO: If all > + other arguments to PHI are INTEGER_CST, we can handle more > + than two arguments too. */ > + if (gimple_phi_num_args (phi) != 2) > +return false; > + > + /* If arg0 is an SSA_NAME and the stmt which defines arg0 is > + ai CONVERT_STMT, use the LHS as new_arg0. */ s/ai/a/ > + if (TREE_CODE (arg0) == SSA_NAME) > +{ > + def0 = SSA_NAME_DEF_STMT (arg0); > + if (!is_gimple_assign (def0) > + || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def0))) > + return false; > + new_arg0 = gimple_assign_rhs1 (def0); > +} > + > + /* If arg1 is an SSA_NAME and the stmt which defines arg0 is > + ai CONVERT_STMT, use the LHS as new_arg1. */ s/ai/a/ > + if (TREE_CODE (arg1) == SSA_NAME) > +{ > + def1 = SSA_NAME_DEF_STMT (arg1); > + if (!is_gimple_assign (def1) > + || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def1))) > + return false; > + new_arg1 = gimple_assign_rhs1 (def1); > +} > + > + /* If arg0 is an INTEGER_CST, fold it to new type. */ > + if (TREE_CODE (arg0) != SSA_NAME) > +{ > + if (!POINTER_TYPE_P (TREE_TYPE (new_arg1)) > + && int_fits_type_p (arg0, TREE_TYPE (new_arg1))) > + new_arg0 = fold_convert (TREE_TYPE (new_arg1), arg0); > + else > + return false; > +} > + > + /* If arg1 is an INTEGER_CST, fold it to new type. */ > + if (TREE_CODE (arg1) != SSA_NAME) > +{ > + if (!POINTER_TYPE_P (TREE_TYPE (new_arg0)) > + && int_fits_type_p (arg1, TREE_TYPE (new_arg0))) > + new_arg1 = fold_convert (TREE_TYPE (new_arg0), arg1); > + else > + return false; > +} > + > + /* If types of new_arg0 and new_arg1 are different bailout. */ > + if (TREE_TYPE (new_arg0) != TREE_TYPE (new_arg1)) > +return false; > + > + /* Replace the PHI stmt with the new_arg0 and new_arg1. Also insert > + a new CONVER_STMT that converts the phi results. */ s/CONVER_STMT/CONVERT_STMT/ > + gsi = gsi_after_labels (gimple_bb (phi)); > + result = PHI_RESULT (phi); > + temp = make_ssa_name (TREE_TYPE (new_arg0), phi); > + > + if (dump_file && (dump_flags & TDF_DETAILS)) > +{ > + fprintf (dump_file, "PHI "); > + print_generic_expr (dump_file, gimple_phi_result (phi), 0); > + fprintf (dump_file, > +" changed to factor CONVERT_EXPR out from COND_EXPR.\n"); > + fprintf (dump_file, "New PHI_RESULT is "); > + print_generic_expr (dump_file, temp, 0); > + fprintf (dump_file, " and new stmt with CONVERT_EXPR defines "); > + print_generic_expr (dump_file, result, 0); > + fprintf (dump_file, ".\n"); > +} > + > + gimple_phi_set_result (phi, temp); > + SET_PHI_ARG_DEF (phi, e0->dest_idx, new_arg0); > + SET_PHI_AR
Re: [PATCH] Fix mips-{mti,img}-linux-gnu boot-strap
Bernd Edlinger writes: > --- gcc/config/mips/mips.c.jj 2015-06-08 23:06:50.0 +0200 > +++ gcc/config/mips/mips.c2015-07-03 13:16:02.637293167 +0200 > @@ -9902,8 +9902,11 @@ mips_cfun_has_inflexible_gp_ref_p (void) > static bool > mips_insn_has_flexible_gp_ref_p (rtx_insn *insn) > { > - return (get_attr_got (insn) != GOT_UNSET > - || mips_small_data_pattern_p (PATTERN (insn)) > + rtx_insn *subinsn; > + FOR_EACH_SUBINSN (subinsn, insn) > +if (get_attr_got (subinsn) != GOT_UNSET) > + return true; > + return (mips_small_data_pattern_p (PATTERN (insn)) > || reg_overlap_mentioned_p (pic_offset_table_rtx, PATTERN (insn))); > } > The final return here would also mishandle SEQUENCE PATTERNs. The idea was that this function would only see "real" instructions, so I think instead the FOR_EACH_SUBINSN should be here: static bool mips_find_gp_ref (bool *cache, bool (*pred) (rtx_insn *)) { rtx_insn *insn; if (!*cache) { push_topmost_sequence (); for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) > if (USEFUL_INSN_P (insn) && pred (insn)) Thanks, Richard
RE: [PATCH] Fix mips-{mti,img}-linux-gnu boot-strap
Bernd Edlinger writes: > Hi,this patch fixes a regression that was triggered by commit > r225260.See > pr66747 for details.Is it OK for > trunk?ThanksBernd. Thanks Bernd. I was just reviewing this PR. I think it will probably be safer to move the fix up to the mips_find_gp_ref and consider each instruction inside a sequence individually at that point. I'm not sure that mips_small_data_pattern_p is safe when given a sequence. It looks like we have never had to compute frame information after delay slot filling Previously, hence a bug in old code. I am happy to rework this to move it into mips_find_gp_ref. Thanks, Matthew
[PATCH] Fix mips-{mti,img}-linux-gnu boot-strap
Hi,this patch fixes a regression that was triggered by commit r225260.See pr66747 for details.Is it OK for trunk?ThanksBernd. 2015-07-04 Bernd Edlinger PR target/66747 * config/mips/mips.c (mips_cfun_has_inflexible_gp_ref_p): Handle instruction sequences. patch-pr66747.diff Description: Binary data
Re: RFC: Add ADDR_EXPR lowering (PR tree-optimization/66718)
On Fri, Jul 03, 2015 at 03:21:47PM +0200, Marek Polacek wrote: > This patch implements a new pass, called laddress, which deals with > lowering ADDR_EXPR assignments. Such lowering ought to help the > vectorizer, but it also could expose more CSE opportunities, maybe > help reassoc, etc. It's only active when optimize != 0. > > So e.g. > _1 = (sizetype) i_9; > _7 = _1 * 4; > _4 = &b + _7; > instead of > _4 = &b[i_9]; > > This triggered 14105 times during the regtest and 6392 times during > the bootstrap. > > The fallout (at least on x86_64) is surprisingly small, i.e. none, just > gcc.dg/vect/pr59984.c test (using -fopenmp-simd) ICEs, but that is due > to a bug in the vectorizer. Jakub has a patch and knows the details. > As the test shows, we're now able to vectorize ADDR_EXPR of non-invariants > (that was the motivation of this pass). Just FYI, while bootstrapping/regtesting your patch together with the one I've posted and another one to vectorize pr59984.c better, I've noticed there is another regression with your patch (reverting my patches doesn't help, disabling your gate does help): +FAIL: libgomp.c/simd-3.c execution test +FAIL: libgomp.c++/simd-3.C execution test on both x86_64-linux and i686-linux (at least on AVX capable box). Most likely hitting another latent vectorizer issue, haven't analyzed it yet. Jakub