Thanks for looking at it in detail.

> Yeah, I think this is potentially a blocker for propagating A into B
> when A is used elsewhere.  Combine is able to combine A and B while
> keeping A in parallel with the result.  I think either fwprop would
> need to try that too, or it would need to be restricted to cases where A
> is only used in B.

That seems a rather severe limitation, and my original use case would
no longer be optimized considerably.  The intention was to replace all
uses (if register pressure allows).  Of course the example is simple
enough that a propagation is always useful if the costs allow it, so it
might not be representative.

I'm wondering if we could (my original misunderstanding) tentatively
try to propagate into all uses of a definition and, when reaching a
certain ratio, decide that it might be worth it, otherwise revert.
That would be very crude though, and not driven by the actual problem
we're trying to avoid.

> I think the summary is:
>
> IMO, we have to be mindful that combine is still to run.  We need to
> avoid making equal-cost changes if the new form is more complex, or
> otherwise likely to interfere with combine.

I guess we don't have a good measure for complexity or "combinability",
and even lower-cost changes could result in worse options later.
Would it make sense to have a strict less-than cost policy for those
more complex propagations?  Or do you consider the approach in its
current shape "hopeless", given the complications we discussed?

> Alternatively, we could delay the optimisation until after combine
> and have freer rein, since we're then just mopping up opportunities
> that other passes left behind.
>
> A while back I was experimenting with a second combine pass.  That was
> the original motivation for rtl-ssa.  I never got chance to finish it
> off though.

This doesn't sound like something that would still materialize before
the end of stage 1 :)
Do you see any way of restricting the current approach to make it less
intrusive and still worthwhile?  Limiting to vec_duplicate might be
much too arbitrary but would still help for my original example.

Regards
 Robin
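
P.S. To make the crude "try all uses, keep or revert" idea a bit more
concrete, here is a rough standalone sketch in Python.  It is only an
illustration and not fwprop or rtl-ssa code: the names
try_propagate_all, make_try_one and min_ratio are made up, instructions
are plain strings, and "success" for a single use stands in for
whatever validity and cost checks the real pass would apply.

```python
def try_propagate_all(uses, try_one, min_ratio):
    """Tentatively propagate into every use of a definition.

    try_one(use) is a hypothetical callback: it applies the change and
    returns an undo callable, or returns None if the use cannot be
    changed.  All changes are kept only if the fraction of successful
    uses reaches min_ratio; otherwise every change is undone.
    """
    undo_stack = []
    for use in uses:
        undo = try_one(use)
        if undo is not None:
            undo_stack.append(undo)

    if uses and len(undo_stack) / len(uses) >= min_ratio:
        return True  # keep the tentative changes

    # Not enough uses benefited: revert in reverse order.
    for undo in reversed(undo_stack):
        undo()
    return False


def make_try_one(insns, replacement, allowed):
    """Build a toy try_one that substitutes 'A' in string instructions.

    'allowed' is the set of instruction indices where the substitution
    is assumed to be valid and acceptable cost-wise.
    """
    def try_one(idx):
        if idx not in allowed:
            return None
        old = insns[idx]
        insns[idx] = old.replace("A", replacement)

        def undo():
            insns[idx] = old
        return undo
    return try_one


insns = ["B = A + 1", "C = A * 2", "D = A - 3"]
try_one = make_try_one(insns, "(vec_duplicate x)", allowed={0, 1})

# Two of three uses succeed; with a 0.75 threshold everything is reverted.
kept_strict = try_propagate_all([0, 1, 2], try_one, min_ratio=0.75)
after_strict = list(insns)

# With a 0.5 threshold the two successful changes are kept.
kept_loose = try_propagate_all([0, 1, 2], try_one, min_ratio=0.5)
after_loose = list(insns)
```

As said above, the ratio is not driven by the actual problem (keeping
combine's options open), so the threshold here is arbitrary.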