Re: [PATCH 0/4] Improve DSE implementation

Jeff Law Fri, 16 Dec 2016 10:03:54 -0800

On 12/16/2016 06:32 AM, Richard Biener wrote:

On Fri, Dec 16, 2016 at 2:54 AM, Jeff Law <l...@redhat.com> wrote:

This is a 4 part patchkit to address various deficiencies in our DSE
implementation.


BZ33562 was the inspiration for this work.  33562 is a low priority
regression that's been around for a long time.  Patch #1 addresses 33562,
"aggregate DSE disabled" and also implements trimming of complex assignment
when just one half of it is dead.
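
As a rough source-level illustration of the complex-assignment case in
patch #1 (a sketch only: the function names are made up, and __real__ /
__imag__ are GNU C extensions), a full complex store followed by a store to
one half leaves only the other half of the first store live:

```c
#include <assert.h>

/* As written: the full complex store is followed by a store to the real
   half, so the real half of the first store is dead.  */
void set_untrimmed(_Complex double *p, _Complex double v, double r)
{
  *p = v;
  __real__ *p = r;
}

/* Hand-trimmed equivalent: only the imaginary half of the first store
   is kept.  */
void set_trimmed(_Complex double *p, _Complex double v, double r)
{
  __imag__ *p = __imag__ v;
  __real__ *p = r;
}

/* Hypothetical check that both forms leave the same value in memory.  */
void check_complex_trim(void)
{
  _Complex double v = 0, a = 0, b = 0;
  __real__ v = 1.0;
  __imag__ v = 2.0;
  set_untrimmed(&a, v, 5.0);
  set_trimmed(&b, v, 5.0);
  assert(__real__ a == __real__ b && __imag__ a == __imag__ b);
  assert(__real__ a == 5.0 && __imag__ a == 2.0);
}
```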

Discussions last year with Richi, a review of bugs in both the LLVM and
GCC databases, and code instrumentation resulted in patches 2-4.

Patch #2 implements trimming of CONSTRUCTOR initializations.  This is
61912/77485.  This gets the most static hits of all the improvements.
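
To make the CONSTRUCTOR case concrete, here is a hand-written sketch of the
kind of source that benefits.  The struct and function names are
hypothetical, and whether a given GCC performs exactly this trimming is an
assumption; the point is only that the two functions are equivalent.

```c
#include <assert.h>
#include <string.h>

struct rec { char key[16]; char payload[48]; };

/* As written: the aggregate is zeroed as a whole, then the first 16
   bytes are overwritten, so zeroing the head is dead.  */
void reset_untrimmed(struct rec *r, const char *key)
{
  *r = (struct rec){ 0 };
  memcpy(r->key, key, sizeof r->key);
}

/* Hand-trimmed equivalent: only the tail still needs to be zeroed.  */
void reset_trimmed(struct rec *r, const char *key)
{
  memset(r->payload, 0, sizeof r->payload);
  memcpy(r->key, key, sizeof r->key);
}

/* Hypothetical check that both forms produce identical bytes.  */
void check_ctor_trim(void)
{
  struct rec a, b;
  char key[16];
  for (int i = 0; i < 16; i++)
    key[i] = (char) (i + 1);
  memset(&a, 0x7f, sizeof a);
  memset(&b, 0x7f, sizeof b);
  reset_untrimmed(&a, key);
  reset_trimmed(&b, key);
  assert(memcmp(&a, &b, sizeof a) == 0);
}
```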

Patch #3 implements trimming of mem* calls.  We trim from the front or back
of the store.  This doesn't hit as much as #2, but still happens quite
often.  There is no BZ for this deficiency.
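
A sketch of trimming a mem* call from the back, written out by hand (the
struct and function names are made up for illustration, and this is not
taken from the patch or its testcases):

```c
#include <assert.h>
#include <string.h>

struct msg { char body[32]; char trailer[32]; };

/* As written: memset clears all 64 bytes, then the last 32 are
   overwritten, so clearing the trailer is dead.  */
void fill_untrimmed(struct msg *m, const char *trailer)
{
  memset(m, 0, sizeof *m);
  memcpy(m->trailer, trailer, sizeof m->trailer);
}

/* Hand-trimmed equivalent: the memset length shrinks to cover only the
   bytes that stay live.  */
void fill_trimmed(struct msg *m, const char *trailer)
{
  memset(m, 0, sizeof m->body);
  memcpy(m->trailer, trailer, sizeof m->trailer);
}

/* Hypothetical check that both forms produce identical bytes.  */
void check_memset_trim(void)
{
  struct msg a, b;
  char trailer[32];
  for (int i = 0; i < 32; i++)
    trailer[i] = (char) (i + 1);
  memset(&a, 0x7f, sizeof a);
  memset(&b, 0x7f, sizeof b);
  fill_untrimmed(&a, trailer);
  fill_trimmed(&b, trailer);
  assert(memcmp(&a, &b, sizeof a) == 0);
}
```

Trimming from the front is the mirror image: the start address moves forward
and the length shrinks by the same amount.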

Patch #4 adds the ability to look through loads which may read from the same
memory as the potentially dead store, but which can be proven only read from
currently dead bytes within the object.  This hits just once in the compiler
& runtime libraries.  But it does hit often in the libstdc++ testsuite.
There is no BZ for this deficiency.
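
My reading of the patch #4 case, as a hand-written sketch (hypothetical
names; not one of the patch's testcases): the load in the middle reads
memory the memset wrote, but only bytes that a later store has already
overwritten, so it does not keep the memset alive.

```c
#include <assert.h>
#include <string.h>

struct state { int hdr[2]; int data[2]; };

int init_with_memset(struct state *s)
{
  memset(s, 0, sizeof *s);        /* candidate dead store */
  s->hdr[0] = 1;                  /* hdr bytes of the memset are now dead */
  s->hdr[1] = 2;
  int sum = s->hdr[0] + s->hdr[1]; /* load touches only those dead bytes */
  s->data[0] = sum;               /* remaining memset bytes overwritten */
  s->data[1] = sum;
  return sum;
}

/* Equivalent function with the memset removed entirely.  */
int init_without_memset(struct state *s)
{
  s->hdr[0] = 1;
  s->hdr[1] = 2;
  int sum = s->hdr[0] + s->hdr[1];
  s->data[0] = sum;
  s->data[1] = sum;
  return sum;
}

/* Hypothetical check that both forms behave identically.  */
void check_load_lookthrough(void)
{
  struct state a, b;
  memset(&a, 0x7f, sizeof a);
  memset(&b, 0x7f, sizeof b);
  int ra = init_with_memset(&a);
  int rb = init_without_memset(&b);
  assert(ra == 3 && rb == 3);
  assert(memcmp(&a, &b, sizeof a) == 0);
}
```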


There are dependencies as we walk forward in the patch kit.  Each patch has
been bootstrapped & tested with its previous patch(es).

There is much more that could be done beyond the series of 4 patches in this
patchkit.  Richi has pointed out that SRA and DSE could probably share a lot
of analysis and transformation code.  There may even be advantages to having
the two optimizations integrated into a single pass.  I haven't investigated
any of that yet (though we are using a bit of code from SRA in this kit).


Yes, see also PR78821; similar other passes are bswap detection and
store merging.  Basically all passes that find "related" loads/stores.
I suppose that, rather than fully merging the passes, trying to share some
analysis code and data structures would be a good start (though at the same
time you'd probably rewrite most of those passes).

Certainly the first step would be to share analysis, data structures and
likely some manipulation routines.

We also need to look at store sinking again.  I saw a patch from Richi back
in July that looked reasonable at a high level and would likely allow
resolution of multiple BZs.


Yeah, it had some fallout / pass ordering issue and I never went back to
it (being also a very simplistic implementation).

Figured it was something like that.

ISTM that store sinking isn't radically different than other cases where
we want to sink through PHIs.  ISTM that a single driver which could
sink stores, arithmetic/logicals, whatever would be in order.  It'd be a
worklist algorithm over PHI nodes since sinking stores through a PHI may
in turn allow it to sink further.
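
To sketch what I mean by a worklist over PHI nodes, here is a toy model.
It is not GCC code: struct phi, the integer value ids and
sink_through_phis are all made up, and "sinkable" stands in for whatever
legality test a real driver would use.  A PHI is processed once all of its
arguments are sinkable; doing so makes its result sinkable, which re-queues
the PHIs that use that result.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PHI 8
#define MAX_ARGS 4

/* Toy PHI node: args[] and result are value ids indexing sinkable[].  */
struct phi {
  int nargs;
  int args[MAX_ARGS];
  int result;
};

/* Returns how many PHIs were sunk through; updates sinkable[] in place.  */
int sink_through_phis(const struct phi *phis, int nphis, bool *sinkable)
{
  int queue[MAX_PHI];                 /* circular worklist of PHI indices */
  bool queued[MAX_PHI] = { false };
  bool done[MAX_PHI] = { false };
  int head = 0, count = 0, sunk = 0;

  assert(nphis <= MAX_PHI);
  for (int i = 0; i < nphis; i++) {
    queue[(head + count++) % MAX_PHI] = i;
    queued[i] = true;
  }

  while (count > 0) {
    int i = queue[head];
    head = (head + 1) % MAX_PHI;
    count--;
    queued[i] = false;

    bool all = !done[i];
    for (int a = 0; all && a < phis[i].nargs; a++)
      all = sinkable[phis[i].args[a]];
    if (!all)
      continue;

    /* Sink through this PHI; its result becomes sinkable too.  */
    done[i] = true;
    sinkable[phis[i].result] = true;
    sunk++;

    /* Re-queue PHIs that use the result; they may now sink further.  */
    for (int j = 0; j < nphis; j++) {
      if (done[j] || queued[j])
        continue;
      for (int a = 0; a < phis[j].nargs; a++)
        if (phis[j].args[a] == phis[i].result) {
          queue[(head + count++) % MAX_PHI] = j;
          queued[j] = true;
          break;
        }
    }
  }
  return sunk;
}
```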

The question in my mind for stores is do we allow the addresses to vary
or just the object stored?  We could actually support both in many cases.
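
At the source level the two cases would look roughly like this (a
hand-written sketch with hypothetical function names, showing the
before/after shapes rather than any existing transformation):

```c
#include <assert.h>

/* Stored object varies, address is the same on both arms.  */
void value_varies(int c, int *p, int a, int b)
{
  if (c)
    *p = a;
  else
    *p = b;
}

/* After sinking: one store below the join, fed by a PHI of the values.  */
void value_varies_sunk(int c, int *p, int a, int b)
{
  int t = c ? a : b;
  *p = t;
}

/* Address varies, stored object is the same on both arms.  */
void addr_varies(int c, int *p, int *q, int v)
{
  if (c)
    *p = v;
  else
    *q = v;
}

/* After sinking: one store below the join, fed by a PHI of the addresses.  */
void addr_varies_sunk(int c, int *p, int *q, int v)
{
  int *addr = c ? p : q;
  *addr = v;
}

/* Hypothetical check that each pair behaves identically.  */
void check_store_sinking(void)
{
  for (int c = 0; c < 2; c++) {
    int x1 = 0, x2 = 0;
    value_varies(c, &x1, 10, 20);
    value_varies_sunk(c, &x2, 10, 20);
    assert(x1 == x2);

    int p1 = 0, q1 = 0, p2 = 0, q2 = 0;
    addr_varies(c, &p1, &q1, 7);
    addr_varies_sunk(c, &p2, &q2, 7);
    assert(p1 == p2 && q1 == q2);
  }
}
```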


Jeff
