Thanks for looking at it in detail.

> Yeah, I think this is potentially a blocker for propagating A into B
> when A is used elsewhere.  Combine is able to combine A and B while
> keeping A in parallel with the result.  I think either fwprop would
> need to try that too, or it would need to be restricted to cases where A
> is only used in B.

That seems a rather severe limitation, and my original use case would
no longer benefit much from the optimization.  The intention was to replace
all uses (if register pressure allows).  Of course the example is simple
enough that a propagation is always useful if the costs allow it, so
it might not be representative.

I'm wondering if we could (my original misunderstanding) tentatively
try to propagate into all uses of a definition and, when reaching
a certain ratio, decide that it might be worth it, otherwise revert.
Would be very crude though, and not driven by the actual problem we're
trying to avoid. 
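
To make the idea a bit more concrete, here is a rough sketch in Python
rather than actual fwprop code.  All names (try_propagate_all,
can_replace, min_ratio) are made up for illustration, and the 0.75
threshold is arbitrary, not something I have measured.

```python
from typing import Callable, List, Sequence


def try_propagate_all(
    uses: Sequence[str],
    can_replace: Callable[[str], bool],
    min_ratio: float = 0.75,
) -> List[str]:
    """Tentatively propagate a definition into every use.

    Returns the uses that would be rewritten if the fraction of
    successful replacements reaches min_ratio, otherwise an empty list
    (i.e. everything is reverted).  `can_replace` is a hypothetical
    stand-in for whatever validity and cost check the pass applies to
    a single use.
    """
    if not uses:
        return []

    # Tentative phase: record which uses would accept the propagation.
    accepted = [use for use in uses if can_replace(use)]

    # Commit only if enough uses could be rewritten; otherwise revert all.
    if len(accepted) / len(uses) >= min_ratio:
        return accepted
    return []
```

The ratio is exactly the crude part: it counts successful replacements
but says nothing about whether the definition stays live or whether
combine would have done better.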

> I think the summary is:
> 
> IMO, we have to be mindful that combine is still to run.  We need to
> avoid making equal-cost changes if the new form is more complex, or
> otherwise likely to interfere with combine.

I guess we don't have a good measure for complexity or "combinability"
and even lower-cost changes could result in worse options later.
Would it make sense to have a strict less-than cost policy for those
more complex propagations?  Or do you consider the approach in its
current shape "hopeless", given the complications we discussed?
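
As a sketch of the policy I have in mind (Python for brevity, not real
fwprop code; should_propagate and is_complex are hypothetical names,
and I am assuming that simple propagations may still be accepted at
equal cost):

```python
def should_propagate(old_cost: int, new_cost: int, is_complex: bool) -> bool:
    """Decide whether a propagation is acceptable under the proposed policy.

    Complex propagations must be strictly cheaper than the original;
    simple ones are assumed to be acceptable at equal cost as well.
    """
    if is_complex:
        # Strict less-than: an equal-cost complex form is rejected.
        return new_cost < old_cost
    return new_cost <= old_cost
```

How to compute is_complex is of course the open question from above.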

> Alternatively, we could delay the optimisation until after combine
> and have freer rein, since we're then just mopping up opportunities
> that other passes left behind.
> 
> A while back I was experimenting with a second combine pass.  That was
> the original motivation for rtl-ssa.  I never got chance to finish it
> off though.

This doesn't sound like something that would still materialize before
the end of stage 1 :)
Do you see any way of restricting the current approach to make it less
intrusive and still worthwhile?  Limiting to vec_duplicate might be
much too arbitrary but would still help for my original example.

Regards
 Robin
