https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110034
--- Comment #4 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
Thank you for providing the test case.

To be honest, I don't see why assigning hr3 to r134 is better.

Currently we have the following assignments:

  hr9->r134; hr3->r173; hr3->r124

and the related preferences:

  cp11:a18(r134)<->a29(r173)@125:shuffle
  pref3:a29(r173)<-hr3@2000
  pref4:a0(r124)<-hr3@125

This removes cost 2000 (pref3) and cost 125 (pref4) and adds cost 125 (cp11). The profit is 2000.

If we started with r173, we would have the following assignments:

  hr3->r173; hr3->r134; <some hard reg other than hr3>->r124

This would remove cost 2000 (pref3) and cost 125 (cp11) and add cost 125 (pref4). The profit would be the same 2000.

Choosing heuristics is very time consuming. I spent a lot of time trying and benchmarking numerous ones. I clearly remember that the introduction of pseudo threads for the colorable bucket gave a visible performance improvement.

Currently we assign pseudos from the thread with the biggest frequency first (r173 and r134), and within the same thread we assign the pseudo with the biggest frequency first (r134). I think it is logical.

Also, it is always possible to find a test (not this case) where heuristics give some undesirable results. RA is an NP-complete task even in the simplest formulation. We cannot get the optimal solution in reasonable time.

Still, I am open to changing any heuristic if somebody can show that it improves performance for some credible benchmark (I prefer SPEC2007) on major GCC targets.
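
For readers who want to check the arithmetic, the two scenarios compared in the comment can be reproduced with a small sketch. This is only a toy model of the three costs quoted there, not GCC's IRA code: the names `PREFS`, `COPIES`, and `profit`, and the placeholder register `hr_other` (standing in for "some hard reg other than hr3"), are assumptions made for illustration. Profit is taken here as the cost of the satisfied preferences and copies minus the cost of the unsatisfied ones, which matches the "removes ... and adds ..." wording.

```python
# Toy model of the costs quoted in the comment (hypothetical names, not IRA code).
# Hard-register preferences: pseudo -> (preferred hard reg, cost if unsatisfied)
PREFS = {
    "pref3": ("r173", "hr3", 2000),
    "pref4": ("r124", "hr3", 125),
}
# Copies between pseudos: cost is paid if the two pseudos get different hard regs.
COPIES = {
    "cp11": ("r134", "r173", 125),
}


def profit(assignment):
    """Cost removed by satisfied prefs/copies minus cost added by unsatisfied ones."""
    total = 0
    for pseudo, hard_reg, cost in PREFS.values():
        total += cost if assignment[pseudo] == hard_reg else -cost
    for a, b, cost in COPIES.values():
        total += cost if assignment[a] == assignment[b] else -cost
    return total


# Current allocation: hr9->r134; hr3->r173; hr3->r124
current = {"r134": "hr9", "r173": "hr3", "r124": "hr3"}

# Alternative (starting with r173): hr3->r173; hr3->r134; something else->r124.
# "hr_other" is a placeholder; the comment does not name the register.
alternative = {"r134": "hr3", "r173": "hr3", "r124": "hr_other"}

print(profit(current), profit(alternative))
```

Under this model both allocations score the same, which is the point made above: the 2000-cost preference dominates, and the two 125-cost items simply trade places.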