On Wed, May 6, 2015 at 4:04 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
> Hi All,
>
> Here is a patch which gives us significant speed-up on HASWELL for
> test containing masked stores. The main goal of that patch is attempt
> to avoid HW hazard for maskmove instructions through inserting
> additional check on zero mask and putting all masked store statements
> into separate block on false edge.All MASK_STORE statements having the
> same mask put into one block. Any comments will be appreciate.

Hmm.  I'm not very happy with this "optimization" happening at the
GIMPLE level - it feels more like a mdreorg thing...

The testcase you add doesn't end up with invalid addresses - so what's
the testcase you are inventing this for?

Looking into the implementation I don't see where you are validating
data dependences of any sort but you are moving stores (and possibly
loads when sinking definition stmts of stored values).  The code-sinking
part should be handled by the existing pass.  Your simple testcase
contains a single masked store, so why does simply conditionalizing
each masked store in mdreorg not work?  It's a hazard (hopefully
fixed eventually), thus not really worth optimizing 100%.

The target hook name is awful.

You don't need a extra flag in struct loop - the vectorizer scans all
insns so it can perfectly well re-compute it.

What this all feels like is more like a un-if-conversion pass which might
be useful for aggressively if-converted vectorized code as well (thus
lots of vec_cond expressions for example).

Richard.

> ChangeLog:
> 2015-05-06  Yuri Rumyantsev  <ysrum...@gmail.com>
>
> * cfgloop.h (has_mask_store): Add new field to struct loop.
> * config/i386/i386.c: Include files stringpool.h and tree-ssanames.h.
> (ix86_vectorize_zero_vector): New function.
> (TARGET_VECTORIZE_ZERO_VECTOR): New target macro
> * doc/tm.texi.in: Add @hook TARGET_VECTORIZE_ZERO_VECTOR.
> * doc/tm.texi: Updated.
> * target.def (zero_vector): New DEFHOOK.
> * tree-if-conv.c (predicate_mem_writes): Set has_mask_store for loop.
> * tree-vect-stmts.c : Include tree-into-ssa.h.
> (optimize_mask_stores): New function.
> * tree-vectorizer.c (vectorize_loops): Zero has_mask_store field for
> non-vectorized loops and invoke optimize_mask_stores function.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/i386/avx2-vect-mask-store-move1.c: New test.

Reply via email to