On September 6, 2016 5:14:47 PM GMT+02:00, Kyrill Tkachov 
<kyrylo.tkac...@foss.arm.com> wrote:
>Hi all,

s/contigous/contiguous/
s/ where where/ where/

+struct merged_store_group
+{
+  HOST_WIDE_INT start;
+  HOST_WIDE_INT width;
+  unsigned char *val;
+  unsigned int align;
+  auto_vec<struct store_immediate_info *> stores;
+  /* We record the first and last original statements in the sequence because
+     because we'll need their vuse/vdef and replacement position.  */
+  gimple *last_stmt;

s/ because because/ because/

Why aren't these two HWIs unsigned, likewise in store_immediate_info and in 
most other spots in the patch?

+         fprintf (dump_file, "Afer writing ");
s/Afer /After/

/access if prohibitively slow/s/ if /is /

I'd get rid of successful_p in imm_store_chain_info::output_merged_stores.


+unsigned int
+pass_store_merging::execute (function *fun)
+{
+  basic_block bb;
+  hash_set<gimple *> orig_stmts;
+
+  FOR_EACH_BB_FN (bb, fun)
+    {
+      gimple_stmt_iterator gsi;
+      HOST_WIDE_INT num_statements = 0;
+      /* Record the original statements so that we can keep track of
+        statements emitted in this pass and not re-process new
+        statements.  */
+      for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+       {
+         gimple_set_visited (gsi_stmt (gsi), false);
+         num_statements++;
+       }
+
+      if (num_statements < 2)
+       continue;

What about debug statements? ISTM you should skip those.
(Isn't visited reset before entry of a pass?)

Maybe I missed the bikeshedding about the name but I'd have used -fmerge-stores 
instead.

Thanks,
>
>The v3 of this patch addresses feedback I received on the version
>posted at [1].
>The merged store buffer is now represented as a char array that we
>splat values onto with
>native_encode_expr and native_interpret_expr. This allows us to merge
>anything that native_encode_expr
>accepts, including floating point values and short vectors. So this
>version extends the functionality
>of the previous one in that it handles floating point values as well.
>
>The first phase of the algorithm that detects the contiguous stores is
>also slightly refactored according
>to feedback to read more fluently.
>
>Richi, I experimented with merging up to MOVE_MAX bytes rather than
>word size but I got worse results on aarch64.
>MOVE_MAX there is 16 (because it has load/store register pair
>instructions) but the 128-bit immediates that we ended
>synthesising were too complex. Perhaps the TImode immediate store RTL
>expansions could be improved, but for now
>I've left the maximum merge size to be BITS_PER_WORD.
>
>I've disabled the pass for PDP-endian targets as the merging code
>proved to be quite fiddly to get right for different
>endiannesses and I didn't feel comfortable writing logic for
>BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN targets without serious
>testing capabilities. I hope that's ok (I note the bswap pass also
>doesn't try to do anything on such targets).
>
>Tested on arm, aarch64, x86_64 and on big-endian arm and aarch64.
>
>How does this version look?
>Thanks,
>Kyrill
>
>[1] https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01512.html
>
>2016-09-06  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>
>     PR middle-end/22141
>     * Makefile.in (OBJS): Add gimple-ssa-store-merging.o.
>     * common.opt (fstore-merging): New Optimization option.
>     * opts.c (default_options_table): Add entry for
>     OPT_ftree_store_merging.
>     * params.def (PARAM_STORE_MERGING_ALLOW_UNALIGNED): Define.
>     * passes.def: Insert pass_tree_store_merging.
>     * tree-pass.h (make_pass_store_merging): Declare extern
>     prototype.
>     * gimple-ssa-store-merging.c: New file.
>     * doc/invoke.texi (Optimization Options): Document
>     -fstore-merging.
>
>2016-09-06  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>             Jakub Jelinek  <ja...@redhat.com>
>
>     PR middle-end/22141
>     * gcc.c-torture/execute/pr22141-1.c: New test.
>     * gcc.c-torture/execute/pr22141-2.c: Likewise.
>     * gcc.target/aarch64/ldp_stp_1.c: Adjust for -fstore-merging.
>     * gcc.target/aarch64/ldp_stp_4.c: Likewise.
>     * gcc.dg/store_merging_1.c: New test.
>     * gcc.dg/store_merging_2.c: Likewise.
>     * gcc.dg/store_merging_3.c: Likewise.
>     * gcc.dg/store_merging_4.c: Likewise.
>     * gcc.dg/store_merging_5.c: Likewise.
>     * gcc.dg/store_merging_6.c: Likewise.
>     * gcc.target/i386/pr22141.c: Likewise.
>     * gcc.target/i386/pr34012.c: Add -fno-store-merging to dg-options.


Reply via email to