http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57706

            Bug ID: 57706
           Summary: LRA is bottleneck while compiling LTO firefox
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hubicka at gcc dot gnu.org

One of the ltrans partitions gets stuck while building Firefox, with the
following profile:
CPU: AMD64 family10, speed 2100 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask of 0x00 (No unit mask) count 750000
samples  %        image name               app name                 symbol name
84432    27.1889  lto1                     lto1                     ggc_internal_alloc_stat(unsigned long)
5490      1.7679  libc-2.11.1.so           libc-2.11.1.so           _int_malloc
4746      1.5283  lto1                     lto1                     bitmap_set_bit(bitmap_head_def*, int)
4155      1.3380  libc-2.11.1.so           libc-2.11.1.so           memset
3190      1.0272  lto1                     lto1                     hash_table_mod1(unsigned int, unsigned int)
3029      0.9754  lto1                     lto1                     for_each_rtx_1(rtx_def*, int, int (*)(rtx_def**, void*), void*)
2860      0.9210  lto1                     lto1                     bitmap_bit_p(bitmap_head_def*, int)
2325      0.7487  lto1                     lto1                     df_note_compute(bitmap_head_def*)
2173      0.6998  as                       as                       hash_lookup
2102      0.6769  lto1                     lto1                     record_reg_classes(int, int, rtx_def**, machine_mode*, char const**, rtx_def*, reg_class*)
1859      0.5986  lto1                     lto1                     constrain_operands(int)
1804      0.5809  lto1                     lto1                     hash_table<variable_hasher, xcallocator>::find_slot_with_hash(void const*, unsigned int, insert_option)
1674      0.5391  libc-2.11.1.so           libc-2.11.1.so           malloc
1660      0.5346  lto1                     lto1                     operand_equal_p(tree_node const*, tree_node const*, unsigned int)
1653      0.5323  lto1                     lto1                     htab_find_slot_with_hash
1543      0.4969  libc-2.11.1.so           libc-2.11.1.so           _int_free
1538      0.4953  lto1                     lto1                     get_attr_enabled(rtx_def*)
1511      0.4866  lto1                     lto1                     mem_attrs_eq_p(mem_attrs const*, mem_attrs const*)
1376      0.4431  libc-2.11.1.so           libc-2.11.1.so           malloc_consolidate

 integrated RA           :  57.28 (11%) usr   0.21 ( 3%) sys  57.51 (11%) wall  382450 kB (106%) ggc
 LRA non-specific        :   5.35 ( 1%) usr   0.02 ( 0%) sys   5.43 ( 1%) wall   24447 kB ( 7%) ggc
 LRA virtuals elimination:   0.35 ( 0%) usr   0.01 ( 0%) sys   0.35 ( 0%) wall    8263 kB ( 2%) ggc
 LRA reload inheritance  :   0.64 ( 0%) usr   0.01 ( 0%) sys   0.78 ( 0%) wall   11556 kB ( 3%) ggc
 LRA create live ranges  :   1.11 ( 0%) usr   0.00 ( 0%) sys   0.89 ( 0%) wall    2973 kB ( 1%) ggc
 LRA hard reg assignment : 166.89 (33%) usr   0.03 ( 0%) sys 166.96 (33%) wall       0 kB ( 0%) ggc
 LRA coalesce pseudo regs:   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 reload                  :   0.13 ( 0%) usr   0.01 ( 0%) sys   0.18 ( 0%) wall       0 kB ( 0%) ggc
 reload CSE regs         :  10.24 ( 2%) usr   0.04 ( 1%) sys  10.31 ( 2%) wall   51758 kB (14%) ggc
 load CSE after reload   :   2.02 ( 0%) usr   0.01 ( 0%) sys   2.10 ( 0%) wall     185 kB ( 0%) ggc
 ree                     :   0.21 ( 0%) usr   0.02 ( 0%) sys   0.19 ( 0%) wall     696 kB ( 0%) ggc
 thread pro- & epilogue  :   0.78 ( 0%) usr   0.00 ( 0%) sys   0.76 ( 0%) wall   21050 kB ( 6%) ggc
 if-conversion 2         :   0.10 ( 0%) usr   0.02 ( 0%) sys   0.16 ( 0%) wall     214 kB ( 0%) ggc
 combine stack adjustments:   0.13 ( 0%) usr   0.02 ( 0%) sys   0.14 ( 0%) wall       0 kB ( 0%) ggc
 peephole 2              :   0.77 ( 0%) usr   0.01 ( 0%) sys   0.70 ( 0%) wall    2982 kB ( 1%) ggc
 rename registers        :   3.87 ( 1%) usr   0.00 ( 0%) sys   3.55 ( 1%) wall   16083 kB ( 4%) ggc
 hard reg cprop          :   1.61 ( 0%) usr   0.01 ( 0%) sys   1.61 ( 0%) wall     821 kB ( 0%) ggc
 scheduling 2            :  11.50 ( 2%) usr   0.03 ( 0%) sys  11.47 ( 2%) wall   15888 kB ( 4%) ggc
 machine dep reorg       :   1.81 ( 0%) usr   0.01 ( 0%) sys   1.71 ( 0%) wall     590 kB ( 0%) ggc
 reorder blocks          :   1.26 ( 0%) usr   0.03 ( 0%) sys   1.12 ( 0%) wall   15841 kB ( 4%) ggc
 shorten branches        :   0.96 ( 0%) usr   0.00 ( 0%) sys   1.13 ( 0%) wall       0 kB ( 0%) ggc
 reg stack               :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      69 kB ( 0%) ggc
 final                   :   6.98 ( 1%) usr   0.46 ( 7%) sys   7.09 ( 1%) wall  129826 kB (36%) ggc
 variable output         :   0.11 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall     669 kB ( 0%) ggc
 symout                  :  15.92 ( 3%) usr   0.14 ( 2%) sys  26.70 ( 5%) wall  406238 kB (113%) ggc
 variable tracking       :  14.50 ( 3%) usr   0.03 ( 0%) sys  14.71 ( 3%) wall  103487 kB (29%) ggc
 var-tracking dataflow   :  11.07 ( 2%) usr   0.01 ( 0%) sys  10.80 ( 2%) wall    2108 kB ( 1%) ggc
 var-tracking emit       :   9.11 ( 2%) usr   0.02 ( 0%) sys   9.26 ( 2%) wall  119939 kB (33%) ggc
 tree if-combine         :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      66 kB ( 0%) ggc
 straight-line strength reduction:   0.34 ( 0%) usr   0.01 ( 0%) sys   0.27 ( 0%) wall    1583 kB ( 0%) ggc
 unaccounted optimizations:   0.00 ( 0%) usr   0.01 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 rest of compilation     :   4.49 ( 1%) usr   1.21 (17%) sys   5.41 ( 1%) wall   56815 kB (16%) ggc
 remove unused locals    :   0.33 ( 0%) usr   0.00 ( 0%) sys   0.37 ( 0%) wall      17 kB ( 0%) ggc
 address taken           :   0.23 ( 0%) usr   0.00 ( 0%) sys   0.27 ( 0%) wall       3 kB ( 0%) ggc
 unaccounted todo        :   2.71 ( 1%) usr   0.42 ( 6%) sys   3.19 ( 1%) wall     225 kB ( 0%) ggc
 rebuild frequencies     :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall      18 kB ( 0%) ggc
 repair loop structures  :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                 : 499.43             7.04           512.56             360511 kB
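
To see at a glance which passes dominate a report like the one above, the rows can be ranked by user time. The following is only a minimal sketch, not part of GCC: the helper name rank_by_user_time and the regular expression are assumptions made for illustration, and it expects each pass on a single line (wrapped rows rejoined first). The three sample rows are copied from the report.

```python
import re

# Three rows copied from the timing report, one pass per line.
SAMPLE = """\
 integrated RA           :  57.28 (11%) usr   0.21 ( 3%) sys  57.51 (11%) wall  382450 kB (106%) ggc
 LRA hard reg assignment : 166.89 (33%) usr   0.03 ( 0%) sys 166.96 (33%) wall       0 kB ( 0%) ggc
 scheduling 2            :  11.50 ( 2%) usr   0.03 ( 0%) sys  11.47 ( 2%) wall   15888 kB ( 4%) ggc
"""

# Hypothetical pattern: pass name, a colon, then the user-time seconds
# followed by its percentage and the "usr" tag. The rest of the row is ignored.
ROW = re.compile(r"^\s*(?P<name>.+?)\s*:\s*(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*usr")


def rank_by_user_time(report):
    """Return (pass name, user seconds) pairs, slowest pass first."""
    rows = []
    for line in report.splitlines():
        match = ROW.match(line)
        if match:  # lines without a "(NN%) usr" field, such as TOTAL, are skipped
            rows.append((match.group("name"), float(match.group("usr"))))
    return sorted(rows, key=lambda row: row[1], reverse=True)


for name, usr in rank_by_user_time(SAMPLE):
    print(f"{usr:8.2f}s  {name}")
```

On the sample rows this puts "LRA hard reg assignment" first, which matches the 33% share the report attributes to it.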
