https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111241

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Samples: 121K of event 'cycles:u', Event count (approx.): 159788164341          
Overhead       Samples  Command  Shared Object       Symbol
  21.24%         25791  cc1      cc1                 [.] get_ref_base_and_exten
   7.75%          9154  cc1      libc-2.31.so        [.] __memset_avx512_erms
   5.61%          7072  cc1      cc1                 [.] dominated_by_p
   2.43%          2936  cc1      cc1                 [.] bitmap_set_bit
   2.38%          3000  cc1      cc1                 [.] dominated_by_p_w_unex
   1.76%          2154  cc1      cc1                 [.] find_base_term
   1.75%          2148  cc1      cc1                 [.] ix86_find_base_term
   1.41%          1656  cc1      cc1                 [.] df_reorganize_refs_by_

All the usual suspects are present ... :/

The memset and df_reorganize_refs_by_defs entries are the known bug that RTL
invariant motion does O(function-size) * O(number-of-loops) work through
df_analyze_loop: reorganize-refs processes all refs in the function, not only
the refs of the loop being analyzed (difficult to fix).
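
To make the O(function-size) * O(number-of-loops) behaviour concrete, here is
a toy cost model (not GCC code). It only counts how many refs get visited when
every per-loop analysis rescans the whole function, versus a hypothetical
variant that touches only the refs inside each loop. The function names and
the numbers are made up for illustration.

```python
def refs_visited_whole_function(total_refs, loop_ref_counts):
    """Refs visited if each loop analysis reorganizes all function refs."""
    visited = 0
    for _ in loop_ref_counts:
        # One full pass over the function per analyzed loop.
        visited += total_refs
    return visited


def refs_visited_loop_local(loop_ref_counts):
    """Refs visited if each loop analysis only touched that loop's refs."""
    return sum(loop_ref_counts)


if __name__ == "__main__":
    # Assumed shape: 100000 refs in the function, 1000 loops of 50 refs each.
    loops = [50] * 1000
    print(refs_visited_whole_function(100_000, loops))  # 100000000
    print(refs_visited_loop_local(loops))               # 50000
```

With these assumed numbers the whole-function variant visits 2000 times more
refs, which is the kind of gap that shows up as memset and reorganize time in
a profile like the one above.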

For get_ref_base_and_extent we have ~2 array-refs per call, and array-ref
processing is expensive (array_ref_element_size, but also
wi::lshift_large).  The most expensive calls come from vn_reference_lookup,
done during elimination when looking for redundant stores (that's odd),
possibly because it enables VN_WALKREWRITE unconditionally; for -O2 that's
also the default, though.
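
As a rough illustration of why each array-ref costs something, the sketch
below models the per-array-ref arithmetic: look up the element size, scale the
index relative to the low bound, and shift the byte offset to a bit offset.
This is a simplified model under stated assumptions, not the code in
get_ref_base_and_extent; the function name and the tuple layout are
hypothetical, and it only handles constant indices.

```python
LOG2_BITS_PER_UNIT = 3  # assumes 8-bit units


def array_refs_bit_offset(array_refs):
    """Accumulate the bit offset of a chain of constant-index array refs.

    array_refs: list of (index, low_bound, element_size_in_bytes) tuples,
    outermost reference first. The layout is made up for this sketch.
    """
    bit_offset = 0
    for index, low_bound, element_size in array_refs:
        # Per array-ref: an element size lookup and a multiply ...
        byte_offset = (index - low_bound) * element_size
        # ... followed by a shift from bytes to bits.
        bit_offset += byte_offset << LOG2_BITS_PER_UNIT
    return bit_offset


if __name__ == "__main__":
    # a[2][3] for "int a[4][5]" with 4-byte int: rows are 20 bytes each.
    print(array_refs_bit_offset([(2, 0, 20), (3, 0, 4)]))  # 416
```

In the compiler these steps work on wide integers rather than Python ints, so
with ~2 array-refs per call the cost adds up across many lookups.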
