http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54146
--- Comment #60 from Steven Bosscher <steven at gcc dot gnu.org> 2012-09-04 10:50:47 UTC --- Another improvement that can be done for the test case, is to assign pseudo-register numbers such that the density of the DF_LR and DF_LIVE bitmaps increases. For the density I used the following definition here: size = (# list elts) * (# bits per elt) cardinality = (# bits set) density = cardinality / size The average density we achieve for the DF_LR_IN sets of the largest function is only 12%, and that number is misleading because there are a few elements with almost all bits set and many elements with only a few bits set. For instance there is one bitmap with the following statistics: size = 50 cardinality = 940 density = 14.7% but the per-element statistics show: bits/elt count 1 20 2 19 3 2 4 2 124 7 so excluding the 7 dense elements (124 bits set out of 128), the density is only 1.3%! The dense bits are for a bunch of SRA variables, the sparse ones for ivtmp variables. The ivtmp variables are live in mostly the same blocks as the SRA variables (>100000 blocks where the variables appear in DF_LR_IN of the block) so the packing is extremely inefficient. This is the cause for most of the slowness of DF on the test case: the bitmap chains are large and extremely sparse.