https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113910

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
Cutting the switch in decodeToMCInst after case 693: (roughly halving it by the
number of source lines) gets us to

 tree PTA                           : 129.70 ( 92%)   0.51 ( 14%) 130.28 ( 90%)  2279k (  0%)
 TOTAL                              : 140.28          3.68        144.01          982M

a profile shows

Samples: 657K of event 'cycles:u', Event count (approx.): 873340708228          
Overhead       Samples  Command  Shared Object       Symbol                     
  88.08%        578019  cc1plus  cc1plus             [.] bitmap_equal_p
   4.76%         31340  cc1plus  cc1plus             [.] equiv_class_lookup_or_a
   0.59%          4039  cc1plus  cc1plus             [.] bitmap_set_bit
   0.24%          1611  cc1plus  cc1plus             [.] bitmap_copy

the way we hash bitmaps is quite bad: we effectively hash the set bits and a
subset of the unset bits.  Adding a simple additional "hash", the number of set
bits, improves this to

Samples: 214K of event 'cycles:u', Event count (approx.): 283548833048          
Overhead       Samples  Command  Shared Object     Symbol                       
  69.73%        148209  cc1plus  cc1plus           [.] bitmap_equal_p
   6.29%         13499  cc1plus  cc1plus           [.] equiv_class_lookup_or_add

of course we still have too many calls (or too large but almost equal bitmaps
here).  Still I have a handle on this.
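
To illustrate the idea of mixing the number of set bits into a bitmap hash,
here is a minimal standalone sketch.  It is not the GCC code: toy_bitmap,
weak_hash, count_bits and combined_hash are hypothetical names, and the XOR
hash is only a stand-in for a hash that collides often.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat bitmap of NWORDS 64-bit words; not GCC's bitmap type.  */
enum { NWORDS = 4 };
typedef struct { uint64_t w[NWORDS]; } toy_bitmap;

/* Count the set bits of one word by clearing the lowest set bit each step.  */
static unsigned popcount64(uint64_t x)
{
  unsigned n = 0;
  while (x) {
    x &= x - 1;
    n++;
  }
  return n;
}

/* Deliberately weak stand-in hash: XOR of all words.  Distinct bitmaps
   collide whenever their words XOR to the same value.  */
static uint64_t weak_hash(const toy_bitmap *b)
{
  uint64_t h = 0;
  for (size_t i = 0; i < NWORDS; i++)
    h ^= b->w[i];
  return h;
}

/* Number of set bits in the whole bitmap.  */
static unsigned count_bits(const toy_bitmap *b)
{
  unsigned n = 0;
  for (size_t i = 0; i < NWORDS; i++)
    n += popcount64(b->w[i]);
  return n;
}

/* Mix the set-bit count into the weak hash.  Equal bitmaps have equal
   counts, so equal bitmaps still hash equally.  */
static uint64_t combined_hash(const toy_bitmap *b)
{
  return weak_hash(b) * 31 + count_bits(b);
}
```

In this sketch the bitmaps {0x3, 0x1, 0, 0} and {0x2, 0, 0, 0} share the same
weak_hash but have different set-bit counts, so combined_hash tells them apart
without a full comparison.  A hash table that only calls its equality function
when hashes match would then compare fewer bitmaps.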
