https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109237

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
Samples: 289K of event 'cycles:u', Event count (approx.): 384226334976
Overhead       Samples  Command  Shared Object     Symbol
   3.52%          9747  cc1      cc1               [.] bb_is_just_return
   2.94%          8241  cc1      cc1               [.] df_note_compute
   2.92%          8085  cc1      cc1               [.] init_alias_analysis
   2.55%          7035  cc1      cc1               [.] delete_trivially_dead_insns
   2.28%          6372  cc1      cc1               [.] contains_no_active_insn_p
   2.16%          6288  cc1      cc1               [.] get_ref_base_and_extent
   2.02%          5785  cc1      cc1               [.] ggc_set_mark
   1.55%          4308  cc1      cc1               [.] fast_dce

bb_is_just_return is high in the profile.  Looking at its implementation,
I wonder whether on RTL we can scan the insns backwards and stop as soon
as the last (real?) insn isn't ANY_RETURN_P ()?  Using
FOR_BB_INSNS_REVERSE takes it off the profile completely.  Will test a patch.
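
To illustrate why the backward scan can stop early, here is a minimal
sketch on a toy model.  It is not the GCC implementation: the enum, the
array representation of a block and the function name are all made up
for the example, and "real" is simplified to "not a debug insn".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction kinds for a toy model; these are not
   GCC's RTL codes.  */
enum toy_kind { TOY_DEBUG, TOY_RETURN, TOY_OTHER };

/* Return true if the block's only real (non-debug) instruction is a
   return.  The scan runs from the last instruction towards the first,
   so a block that ends in anything other than a return is rejected
   at its last real instruction.  If VISITED is non-null it receives
   the number of instructions examined.  */
bool
toy_block_is_just_return (const enum toy_kind *insns, size_t n,
                          size_t *visited)
{
  bool seen_return = false;
  bool result = true;
  size_t count = 0;

  for (size_t i = n; i-- > 0; )
    {
      count++;
      if (insns[i] == TOY_DEBUG)
        continue;               /* Not a real instruction; skip it.  */
      if (!seen_return && insns[i] == TOY_RETURN)
        {
          seen_return = true;   /* Last real instruction is a return.  */
          continue;
        }
      result = false;           /* Some other real instruction: stop.  */
      break;
    }

  if (visited)
    *visited = count;
  return result && seen_return;
}
```

In this model a block that ends in an ordinary insn is rejected after
looking only at its tail, however many insns precede it.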

Similarly, contains_no_active_insn_p is high up in the profile, and it
looks like micro-optimizing it a bit would help.  Using NONDEBUG_INSN_P
to guard the flow_active_insn_p call doesn't seem to help (but perf is
always noisy).
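
For reference, the shape of that guard on a toy model looks roughly like
the sketch below.  Everything in it is hypothetical (the enum, the
stand-in predicate and what it treats as "active", the call counter); it
only shows a cheap kind test short-circuiting a costlier predicate, not
what flow_active_insn_p actually checks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction kinds for a toy model; not GCC's RTL codes.  */
enum demo_kind { DEMO_DEBUG, DEMO_NOTE, DEMO_JUMP, DEMO_ARITH };

/* Counts calls to the stand-in predicate below, for illustration only.  */
size_t demo_predicate_calls = 0;

/* Stand-in for a more expensive per-instruction predicate.  Treating
   only DEMO_ARITH as active is an assumption of this toy model.  */
bool
demo_is_active (enum demo_kind kind)
{
  demo_predicate_calls++;
  return kind == DEMO_ARITH;
}

/* Return true if no instruction in the block is active.  The cheap
   kind comparison filters out debug instructions before the stand-in
   predicate is called.  */
bool
demo_contains_no_active_insn (const enum demo_kind *insns, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (insns[i] != DEMO_DEBUG && demo_is_active (insns[i]))
      return false;
  return true;
}
```

Whether such a guard pays off depends on how many insns it filters out
and on how cheap the guarded call already is.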
