On 4/14/2021 12:41 AM, Richard Biener wrote:
On Wed, 14 Apr 2021, Xionghu Luo wrote:

Hi,

On 2021/3/26 15:35, Xionghu Luo via Gcc-patches wrote:
Also we already have a sinking pass on RTL which even computes
a proper PRE on the reverse graph - -fgcse-sm aka store-motion.c.
I'm not sure whether this deals with non-stores but the
LCM machinery definitely can handle arbitrary expressions.  I wonder
if it makes more sense to extend this rather than inventing a new
ad-hoc sinking pass?
Literally speaking, my pass doesn't handle or process store instructions
the way store-motion does.  Thanks, will check it.
Store motion only processes store instructions.  It sets up data flow
equations with 4 inputs (st_kill, st_avloc, st_antloc, st_transp) and
solves them globally through the Lazy Code Motion API (5 DF compute calls),
producing 2 outputs (st_delete_map, st_insert_map).  Each store location is
represented independently in the input bitmap vectors.  The output says
which stores should be deleted and where to insert them.  The current code
does what you said, "emit copies to a new pseudo at the original insn
location and use it in followed bb", so it is actually "store replacement"
rather than "store move".  Why not save one pseudo by moving the store
instruction to the target edge directly?
It probably simply saves the pass from doing analysis whether the
stored value is clobbered on the sinking path, enabling more store
sinking.  For stores that might be even beneficial, for non-stores
it becomes more of a cost issue, yes.
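
To picture the "one bit per store per basic block" inputs mentioned above, here is a minimal Python sketch.  It is a toy model under stated assumptions, not GCC code: the op tuples, the block names and the helper local_vectors are all hypothetical, only st_kill and st_transp are modeled, and a block is simply assumed to kill a store when it loads from or clobbers the same location.

```python
# Toy model, not GCC code: each basic block is a list of (kind, location) ops.
# Assumption: a block "kills" a store to `loc` if it contains a 'load' or
# 'clobber' of that same location.

def local_vectors(blocks, stores):
    """Return (st_kill, st_transp): per-block sets of store indices."""
    st_kill, st_transp = {}, {}
    all_stores = set(range(len(stores)))
    for name, ops in blocks.items():
        killed = {
            idx
            for idx, loc in enumerate(stores)
            if any(kind in ("load", "clobber") and where == loc
                   for kind, where in ops)
        }
        st_kill[name] = killed
        # In this sketch a block is transparent for exactly the stores
        # it does not kill.
        st_transp[name] = all_stores - killed
    return st_kill, st_transp


stores = ["a", "b"]                      # store 0 writes a, store 1 writes b
blocks = {
    "bb2": [("store", "a"), ("store", "b")],
    "bb3": [("load", "a")],              # reads a, so store 0 cannot pass
    "bb4": [],
}
st_kill, st_transp = local_vectors(blocks, stores)
```

A global solver would then consume such per-block sets for every store at once, which is what makes the approach independent of the order in which individual stores are visited.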

There are many differences between the newly added rtl-sink pass and
store-motion pass.
1. Store motion moves only store instructions, rtl-sink ignores store
instructions.
2. Store motion solves a global DF problem, while rtl-sink only walks the
loop header in reverse order with a dependency check inside the loop.  Take
the RTL below as an example:
"#538,#235,#234,#233" will all be sunk from bb 35 to bb 37 by rtl-sink,
but it moves #538 first, then #235, since there is a strong dependency
between them.  That doesn't seem to fit the LCM framework, which solves
everything and does the delete-insert in one iteration.
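
The ordering described above can be sketched as a backward walk over the block.  This is a toy Python model, not the patch's code: the register names, the extra insns #100 and #101, and the assumed chain #233 -> #234 -> #235 -> #538 are hypothetical, and the block is assumed to be straight-line with no loop-carried uses.

```python
# Toy model, not the rtl-sink patch. Each insn is (uid, dest_reg, used_regs),
# listed in program order. dest_reg is None for insns that define nothing
# (e.g. a compare-and-branch), which this sketch never sinks.

def sink_order(insns):
    """Return uids in the order a backward walk would sink them."""
    sunk = []     # uids in the order they are moved out of the loop
    stay = set()  # uids that must remain in the loop
    for uid, dest, _uses in reversed(insns):
        # An insn is blocked if some insn that stays in the loop
        # still reads its result.
        blocked = dest is None or any(
            dest in other_uses
            for other_uid, _, other_uses in insns
            if other_uid in stay
        )
        if blocked:
            stay.add(uid)
        else:
            sunk.append(uid)
    return sunk


insns = [
    (100, "r9", ()),         # feeds the loop exit test, must stay
    (233, "r1", ()),
    (234, "r2", ("r1",)),
    (235, "r3", ("r2",)),
    (538, "r4", ("r3",)),    # r4 is only needed after the loop
    (101, None, ("r9",)),    # loop exit test
]
order = sink_order(insns)
```

In this model #235 only becomes sinkable because its single user #538 was visited and sunk first, which is the dependency the walk relies on.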
So my question was whether we want to do both within the LCM store
sinking framework.  The LCM dataflow is also used by RTL PRE which
handles both loads and non-loads so in principle it should be able
to handle stores and non-stores for the sinking case (PRE on the
reverse CFG).

A global dataflow is more powerful than any local ad-hoc method.

IIRC you can use LCM on stores like this, but you have to run it
independently on each store to pick up the secondary effects.  I believe
the basic concepts are discussed in Morgan's book.  That may turn out to
be too expensive in practice -- I've never tried it though.
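
One way to picture the "secondary effects" and their cost is a solver that decides from a fixed snapshot and therefore has to be re-run after every move.  The Python sketch below is only a toy under that assumption, not LCM itself and not GCC code: the helpers movable and sink_until_stable, the register names and the dependency chain are hypothetical.

```python
# Toy model, not LCM: each insn is (uid, dest_reg, used_regs).

def movable(insns):
    """One solve over a fixed snapshot: an insn can move only if nothing
    still in the block uses its result."""
    return [
        uid
        for uid, dest, _ in insns
        if dest is not None
        and not any(dest in uses for _, _, uses in insns)
    ]


def sink_until_stable(insns):
    """Re-run the snapshot solve until no further insn can move.
    Returns the list of batches, one per round."""
    rounds = []
    remaining = list(insns)
    while True:
        batch = movable(remaining)
        if not batch:
            break
        rounds.append(batch)
        remaining = [insn for insn in remaining if insn[0] not in batch]
    return rounds


chain = [
    (233, "r1", ()),
    (234, "r2", ("r1",)),
    (235, "r3", ("r2",)),
    (538, "r4", ("r3",)),
]
rounds = sink_until_stable(chain)
```

For a dependency chain of length n this toy needs n rounds, each moving a single insn, which hints at why repeated solves could become expensive in practice.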

jeff

