On Wed, Aug 3, 2016 at 5:15 PM, Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> wrote: > Hi Richard, > > On 18/07/16 13:22, Richard Biener wrote: >> >> On Fri, Jul 15, 2016 at 5:13 PM, Kyrill Tkachov >> <kyrylo.tkac...@foss.arm.com> wrote: > > > <snip> > > >> + /* Record the original statements so that we can keep track of >> + statements emitted in this pass and not re-process new >> + statements. */ >> + for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next >> (&gsi)) >> + { >> + gimple *stmt = gsi_stmt (gsi); >> + if (!is_gimple_debug (stmt)) >> + orig_stmts.add (stmt); >> + num_statements++; >> + } >> >> please use gimple_set_visited () instead, that should be cheaper. >> >> >> + do >> + { >> + changes_made = false; >> + for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next >> (&gsi)) >> + { >> ... >> + } >> + while (changes_made); >> >> looks pretty quadratic to me. Instead of tracking things with >> m_curr_base_expr >> why not use a hash-map to track stores related to a base? >> >> + /* Don't handle bitfields that are not a >> multiple >> + of BITS_PER_UNIT for now. Can be extended >> + later. */ >> + && (bitsize % BITS_PER_UNIT == 0) >> >> :( >> >> + && !stmt_interferes_with_mem_accesses_p (lhs)) >> >> given this function loops over all accesses this is quadratic as well. >> >> I think alias queries could be simplified if you reduce them to alias >> checks on the base object (and allow overlapping constant stores >> which should be easy to handle during merging). > > > I've implemented that and it simplified the detecting code as well as its > complexity > but I'm now missing some cases that were being caught before. > An example is: > struct bar { > int a; > char b; > char c; > char d; > char e; > char g; > }; > > void > foo1 (struct bar *p, char tmp) > { > p->a = 0; > p->b = 0; > p->g = tmp; > p->c = 0; > p->d = 0; > p->e = 0; > } > > The store to 'g' doesn't interfere with the contiguous stores to the early > fields but because > we perform the aliasing checks on the base object 'p' rather than the full > LHS of the assignments > this is deemed to alias the constant stores and only the first two and the > last three constant stores > are merged instead of the full 5 stores. > > I'll experiment with some solutions for this involving recording the > non-constant stores and performing > some trickery during the merging phase.
Not sure how/where exactly you perform the alias checks but alias checks inbetween a group of same bases should use the full reference to also factor in offsets/sizes. Only cross-group I'd resort to base-only alias-checks. Richard. > Thanks, > Kyrill > > >> The VUSE/VDEF handling is somewhat odd. All stores have both >> a VDEF and a VUSE, if you merge a set of them you can simply >> re-use the VDEF/VUSE of one of them, effectively replacing the >> stmt. For all other stores you remove you miss a >> unlink_stmt_vdef (stmt); >> release_defs (stmt); >> to update virtual SSA form and properly release SSA names. >> >> As you use TBAA in your alias checks you may only _sink_ >> stores. Not sure if your merged store placement ensures this. >> >> I think this kind of transforms is useful in early optimizations, not only >> very late as you perform it. Of course it should be really cheap there. >> >> Note that options like -ftree-store-widening are discouraged >> ("tree" does mean nothing to our users and store widening isn't >> what is done - it's store merging). Simply name it -fstore-merging. >> Also the file name should be gimple-ssa-store-merging.c >> >> Looking forward to an algorithmically enhanced version. >> >> Richard. >> >> >>> Thanks, >>> Kyrill >>> >>> N.B. I'm going on vacation until August so I won't be able to respond to >>> any >>> feedback until I get back. >>> >>> [1] https://gcc.gnu.org/ml/gcc-patches/2009-09/msg01745.html >>> >>> 2016-07-15 Kyrylo Tkachov <kyrylo.tkac...@arm.com> >>> >>> PR middle-end/22141 >>> * Makefile.in (OBJS): Add tree-ssa-store-widening.o. >>> * common.opt (ftree-store-widening): New Optimization option. >>> * opts.c (default_options_table): Add entry for >>> OPT_ftree_store_widening. >>> * params.def (PARAM_STORE_WIDENING_ALLOW_UNALIGNED): Define. >>> * passes.def: Insert pass_tree_store_widening. >>> * tree-pass.h (make_pass_tree_store_widening): Declare extern >>> prototype. >>> * tree-ssa-store-widening.c: New file. >>> * doc/invoke.texi (Optimization Options): Document >>> -ftree-store-widening. >>> >>> 2016-07-15 Kyrylo Tkachov <kyrylo.tkac...@arm.com> >>> Jakub Jelinek <ja...@redhat.com> >>> >>> PR middle-end/22141 >>> * gcc.c-torture/execute/pr22141-1.c: New test. >>> * gcc.c-torture/execute/pr22141-2.c: Likewise. >>> * gcc.target/aarch64/ldp_stp_1.c: Adjust for -ftree-store-widening. >>> * gcc.dg/store_widening_1.c: New test. >>> * gcc.dg/store_widening_2.c: Likewise. >>> * gcc.dg/store_widening_3.c: Likewise. >>> * gcc.dg/store_widening_4.c: Likewise. >>> * gcc.target/i386/pr22141.c: Likewise. > >