On Thu, Mar 24, 2011 at 1:39 PM, Richard Guenther <richard.guent...@gmail.com> wrote: > On Thu, Mar 24, 2011 at 11:03 AM, H.J. Lu <hjl.to...@gmail.com> wrote: >> On Tue, Mar 22, 2011 at 6:28 AM, Ira Rosen <ira.ro...@linaro.org> wrote: >>> On 17 March 2011 16:48, Richard Guenther <richard.guent...@gmail.com> wrote: >>> >>>>> + then_datarefs = VEC_alloc (data_reference_p, heap, 1); >>>>> + else_datarefs = VEC_alloc (data_reference_p, heap, 1); >>>>> + then_ddrs = VEC_alloc (ddr_p, heap, 1); >>>>> + else_ddrs = VEC_alloc (ddr_p, heap, 1); >>>>> + if (!compute_data_dependences_for_bb (then_bb, false, &then_datarefs, >>>>> + &then_ddrs) >>>> >>>> Can we avoid computing dependencies if the other BB would have no >>>> data-refs? Thus, split collecting datarefs and computing dependences? >>> >>> Done. >>> >>>> >>>>> + || !compute_data_dependences_for_bb (else_bb, false, >>>>> &else_datarefs, >>>>> + &else_ddrs) >>>>> + || !VEC_length (data_reference_p, then_datarefs) >>>>> + || !VEC_length (data_reference_p, else_datarefs)) >>>>> + { >>>>> + free_data_refs (then_datarefs); >>>>> + free_data_refs (else_datarefs); >>>>> + free_dependence_relations (then_ddrs); >>>>> + free_dependence_relations (else_ddrs); >>>>> + return false; >>>>> + } >>>>> + >>>>> + /* Check that there are no read-after-write or write-after-write >>>>> dependencies >>>>> + in THEN_BB. */ >>>>> + FOR_EACH_VEC_ELT (ddr_p, then_ddrs, i, ddr) >>>>> + { >>>>> + struct data_reference *dra = DDR_A (ddr); >>>>> + struct data_reference *drb = DDR_B (ddr); >>>>> + >>>>> + if (DDR_ARE_DEPENDENT (ddr) != chrec_known >>>>> + && ((DR_IS_READ (dra) && DR_IS_WRITE (drb) >>>>> + && gimple_uid (DR_STMT (dra)) > gimple_uid (DR_STMT >>>>> (drb))) >>>>> + || (DR_IS_READ (drb) && DR_IS_WRITE (dra) >>>>> + && gimple_uid (DR_STMT (drb)) > gimple_uid (DR_STMT >>>>> (dra))) >>>>> + || (DR_IS_WRITE (dra) && DR_IS_WRITE (drb)))) >>>> >>>> The gimple_uids are not initialized here, you need to make sure to >>>> call renumber_gimple_stmt_uids () before starting. Note that phiopt >>>> incrementally changes the IL, so I'm not sure those uids will stay >>>> valid as stmts are moved across blocks. >>> >>> I added a call to renumber_gimple_stmt_uids_in_blocks() before data >>> dependence checks, and there are no code changes between that and the >>> checks, so, I think, it should be OK. >>> >>>> >>>>> + { >>>>> + free_data_refs (then_datarefs); >>>>> + free_data_refs (else_datarefs); >>>>> + free_dependence_relations (then_ddrs); >>>>> + free_dependence_relations (else_ddrs); >>>>> + return false; >>>>> + } >>>>> + } >>>>> + >>>>> + /* Check that there are no read-after-write or write-after-write >>>>> dependencies >>>>> + in ELSE_BB. */ >>>>> + FOR_EACH_VEC_ELT (ddr_p, else_ddrs, i, ddr) >>>>> + { >>>>> + struct data_reference *dra = DDR_A (ddr); >>>>> + struct data_reference *drb = DDR_B (ddr); >>>>> + >>>>> + if (DDR_ARE_DEPENDENT (ddr) != chrec_known >>>>> + && ((DR_IS_READ (dra) && DR_IS_WRITE (drb) >>>>> + && gimple_uid (DR_STMT (dra)) > gimple_uid (DR_STMT >>>>> (drb))) >>>>> + || (DR_IS_READ (drb) && DR_IS_WRITE (dra) >>>>> + && gimple_uid (DR_STMT (drb)) > gimple_uid (DR_STMT >>>>> (dra))) >>>>> + || (DR_IS_WRITE (dra) && DR_IS_WRITE (drb)))) >>>>> + { >>>>> + free_data_refs (then_datarefs); >>>>> + free_data_refs (else_datarefs); >>>>> + free_dependence_relations (then_ddrs); >>>>> + free_dependence_relations (else_ddrs); >>>>> + return false; >>>>> + } >>>>> + } >>>>> + >>>>> + /* Found pairs of stores with equal LHS. */ >>>>> + FOR_EACH_VEC_ELT (data_reference_p, then_datarefs, i, then_dr) >>>>> + { >>>>> + if (DR_IS_READ (then_dr)) >>>>> + continue; >>>>> + >>>>> + then_store = DR_STMT (then_dr); >>>>> + then_lhs = gimple_assign_lhs (then_store); >>>>> + found = false; >>>>> + >>>>> + FOR_EACH_VEC_ELT (data_reference_p, else_datarefs, j, else_dr) >>>>> + { >>>>> + if (DR_IS_READ (else_dr)) >>>>> + continue; >>>>> + >>>>> + else_store = DR_STMT (else_dr); >>>>> + else_lhs = gimple_assign_lhs (else_store); >>>>> + >>>>> + if (operand_equal_p (then_lhs, else_lhs, 0)) >>>>> + { >>>>> + found = true; >>>>> + break; >>>>> + } >>>>> + } >>>>> + >>>>> + if (!found) >>>>> + continue; >>>>> + >>>>> + res = cond_if_else_store_replacement_1 (then_bb, else_bb, join_bb, >>>>> + then_store, else_store); >>>> >>>> So you are executing if-else store replacement for common data reference >>>> pairs only. I think it's cheaper to collect those pairs before computing >>>> dependences and only if there is at least one pair perform the >>>> optimization. >>> >>> OK, I changed the order. >>> >>>> >>>> You basically perform store sinking, creating a PHI node for each store >>>> you sink (that's then probably if-converted by if-conversion later, >>>> eventually >>>> redundant with -ftree-loop-if-convert-stores?) >>>> >>>> I am concerned that having no bound on the number of stores sunk will >>>> increase register pressure and does not allow scheduling of the stores >>>> in an optimal way. Consider two BBs similar to >>>> >>>> t = a + b; >>>> *p = t; >>>> t = c + d; >>>> *q = t; >>>> >>>> where the transformation undoes a good schedule and makes fixing it >>>> impossible if the remaining statements are not if-convertible. >>>> >>>> Thus, I'd rather make this transformation only if in the end the >>>> conditional >>>> can be completely if-converted. >>>> >>>> I realize that we already do unbound and very aggressive if-conversion >>>> in tree-ifcvt.c regardless of whether the loop will be vectorized or not >>>> (including leaking the if-converted loops to the various loop versions >>>> we create during vectorization, causing only code-size bloat). But it's >>>> not a good reason to continue down this road ;) >>>> >>>> I suppose a simple maximum on the number of stores to sink >>>> controllable by a param should do, eventually disabling this >>>> extended transformation when vectorization is disabled? >>> >>> I added a param, and it's set to 0 if either vectorization or if-conversion >>> is disabled. >>> >>>> >>>> Otherwise the implementation looks good. >>> >>> Bootstrapped and tested on powerpc64-suse-linux. >>> OK to apply? >>> >>> Thanks, >>> Ira >>> >>> ChangeLog: >>> >>> * doc/invoke.texi (max-stores-to-sink): Document. >>> * params.h (MAX_STORES_TO_SINK): Define. >>> * opts.c (finish_options): Set MAX_STORES_TO_SINK to 0 >>> if either vectorization or if-conversion is disabled. >>> * tree-data-ref.c (dr_equal_offsets_p1): Moved and renamed from >>> tree-vect-data-refs.c vect_equal_offsets. >>> (dr_equal_offsets_p): New function. >>> (find_data_references_in_bb): Remove static. >>> * tree-data-ref.h (find_data_references_in_bb): Declare. >>> (dr_equal_offsets_p): Likewise. >>> * tree-vect-data-refs.c (vect_equal_offsets): Move to tree-data-ref.c. >>> (vect_drs_dependent_in_basic_block): Update calls to vect_equal_offsets. >>> (vect_check_interleaving): Likewise. >>> * tree-ssa-phiopt.c: Include cfgloop.h and tree-data-ref.h. >>> (cond_if_else_store_replacement): Rename to... >>> (cond_if_else_store_replacement_1): ... this. Change arguments and >>> documentation. >>> (cond_if_else_store_replacement): New function. >>> * Makefile.in (tree-ssa-phiopt.o): Adjust dependencies. >>> * params.def (PARAM_MAX_STORES_TO_SINK): Define. >>> >>> testsuite/ChangeLog: >>> >>> * gcc.dg/vect/vect-cselim-1.c: New test. >>> * gcc.dg/vect/vect-cselim-2.c: New test. >>> >> >> This caused: >> >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48270 > > Yep, I see > > FAIL: gcc.c-torture/execute/builtins/strlen-2.c compilation, -O3 > -fomit-frame-p > ointer (internal compiler error) > > on x86_64-linux with > > #0 0x0000000001f53253 in gimple_uid (g=0x0) > at /space/rguenther/src/svn/trunk/gcc/gimple.h:1297 > 1297 return g->gsbase.uid; > (gdb) > Bottom (innermost) frame selected; you cannot go down. > (gdb) up > #1 0x0000000001f7753d in cond_if_else_store_replacement ( > then_bb=0x7ffff533abc8, else_bb=0x7ffff533ac30, join_bb=0x7ffff533ac98) > at /space/rguenther/src/svn/trunk/gcc/tree-ssa-phiopt.c:1511 > 1511 if (DDR_ARE_DEPENDENT (ddr) != chrec_known > > DR_STMT of dra is NULL. No idea how that can happen.
Maybe because 1495 compute_all_dependences (then_datarefs, &then_ddrs, NULL, false); 1496 compute_all_dependences (else_datarefs, &else_ddrs, NULL, false); 1497 free_data_refs (then_datarefs); 1498 free_data_refs (else_datarefs); frees the data refs too early? Richard.