https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64058
--- Comment #16 from rguenther at suse dot de <rguenther at suse dot de> --- On Thu, 10 Mar 2016, law at redhat dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64058 > > Jeffrey A. Law <law at redhat dot com> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > CC| |amacleod at redhat dot com > > --- Comment #14 from Jeffrey A. Law <law at redhat dot com> --- > So working on the assumption that full stability for SSA_NAME_VERSION changes > isn't in the cards, I went back to my patch which avoids using > SSA_NAME_VERSION > as the tie-breaker given two identical coalescing costs. Yes, and that's a good change which I think we should put into GCC 7 regardless whether it fixes these PRs or not. > I can verify for this BZ that change brings stability to the order in which we > process coalescing candidates with the same cost. Essentially the tie breaker > becomes which coalescing opportunity is seen first during the coalescer's walk > of the IL. > > That's fine and good, but as we know only addresses half the problem. It does > not address whether or not adjustments need to be made to the costing > mechanism > itself or whether we can come up with a better tie breaker. > > For this BZ, using the referenced trunk versions these are the key coalescings > in the chain. > r216303: > (4004) _1 <-> _316 > (3276) _1 <-> _200 > (3276) u1_lsm.6_253 <-> _316 > > Coalesce list: (1)_1 & (316)_316 [map: 0, 98] : Success -> 0 > Coalesce list: (1)_1 & (200)_200 [map: 0, 46] : Success -> 0 > Coalesce list: (1)_1 & (253)l1_lsm.7_159 [map: 0, 32] : Fail due to conflict > > > The successful coalescings and subsequent merging of conflicts cause the 3rd > coalesce to fail (and others). This is actually what we want for this BZ. > > In r216304 we have: > (4004) _315 <-> _316 > (3276) u1_lsm.6_253 <-> _316 > (3276) _314 <-> _315 > > Coalesce list: (315)_315 & (316)_316 [map: 97, 98] : Success -> 97 > Coalesce list: (253)l1_lsm.7_58 & (316)_315 [map: 4, 97] : Success -> 4 > Coalesce list: (314)_314 & (315)l1_lsm.7_58 [map: 96, 4] : Fail due to > conflict > > > Note that the coalescing pairs with cost 3276 are reversed due to > SSA_NAME_VERSION churn. That allows 253 to coalesce with 316 which in turn > prevents 314 from coalescing with 315 (and other downstream effects). The net > result of those downstream effects is an unwanted copy. > > Looking at the costing metrics, they are strictly based on the edge frequency > and flags where the copy would have to be inserted. > > What jumps out to me is that the costing metrics don't look at all at the > conflicts. When we coalesce two objects, we have to merge their conflicts > into > the partition leader. Essentially each time we coalesce, the resulting > partition leader is going to have a larger conflict list and is less likely to > participate in further coalescing. > > That might argue that peeking at the union of the resulting conflicts and > giving higher priority to the one that has a smaller resulting conflict set > may > be useful in the cost calculation or as a better tie-breaker. > > So going back to r216303 and looking at the original partition conflicts we > have: > > Partition 0 (_1 - 1 ) > Partition 98 (_316 - 316 ) > Partition 46 (_200 - 200 ) > Partition 72 (u1_lsm.6_253 - 253 ) > > Conflict graph: > 0: 4, 9, 28, 35, 41, 42, 44, 53, 60, 66, 75, 81, 84, 90, 93, 96, 100, 102 > > 46: 4, 5, 6, 9, 28, 35, 41, 42, 44, 51, 63, 66, 75, 81, 84, 90, 93, 96, 100, > 102 > > 72: 4, 9, 28, 35, 44, 55, 67, 74, 75, 78, 84, 85, 90, 93, 96, 102, 108, 110, > 111 > > 98: 4, 7, 8, 9, 10, 11, 14, 15, 27, 28, 35, 41, 42, 44, 47, 48, 50, 52, 56, > 57, > 58, 59, 65, 75, 81, 84, 90, 93, 96, 102 > > Coalescing _1 and _200 will result in a node with 22 unique conflicts. > Coalescing _253 and _316 will result in a node with 38 unique conflicts. > > So if we used the conflicts sets either directly in the costing model or for > breaking ties and gave preference to coalescings with smaller resulting > conflicts, we'd get the right code for this test. Yeah, this boils down to getting closer to a global optimal coalescing solution rather than the "1-level" local optimal solution we are computing. But this should be addressed in an algorithmically sound way as generally this is a NP hard problem and we want a "cheap" way for -O[01g] at least as well as for must-coalesces. > The question then becomes whether or not that is good in general. Finding a better solution is certainly good in general (if our local cost computations are accurate). Whether it's algorithmically feasible is another question. I suppose one may find some answers in the theory of SSA based register allocation.