http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45422

--- Comment #42 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-01-27 16:30:52 UTC ---
Comparing -O3 -ffast-math -funroll-loops -fno-inline -fno-partial-inlining
(thus generic arch, without prefetching):

trunk:

 df live regs          :   4.22 ( 6%) usr   0.04 ( 2%) sys   4.11 ( 5%) wall       0 kB ( 0%) ggc
 tree iv optimization  :   3.92 ( 5%) usr   0.13 ( 5%) sys   4.29 ( 6%) wall   91066 kB (11%) ggc
 integrated RA         :   5.57 ( 8%) usr   0.10 ( 4%) sys   5.93 ( 8%) wall   26408 kB ( 3%) ggc
 scheduling 2          :   3.73 ( 5%) usr   0.04 ( 2%) sys   3.85 ( 5%) wall     939 kB ( 0%) ggc
 TOTAL                 :  73.68             2.37            76.91              852775 kB

4.5:

 df live regs          :   4.60 ( 7%) usr   0.02 ( 1%) sys   4.62 ( 6%) wall       0 kB ( 0%) ggc
 expand                :   3.94 ( 6%) usr   0.17 ( 8%) sys   3.94 ( 6%) wall   62218 kB ( 8%) ggc
 integrated RA         :   5.73 ( 8%) usr   0.02 ( 1%) sys   5.76 ( 8%) wall   22920 kB ( 3%) ggc
 reload                :   3.78 ( 5%) usr   0.08 ( 4%) sys   3.86 ( 5%) wall    9291 kB ( 1%) ggc
 TOTAL                 :  68.98             2.01            71.22              828137 kB

It would be nice to confirm that we are indeed much better at
optimizing bounds-checking code.  The prefetching issue is
tracked as PR44688.  So I'd close this either as a dup or as
wontfix (it's a feature that we optimize loops with bounds-checking).
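
As a rough aid for reading the two reports, the TOTAL rows can be compared
directly. The Python sketch below is not part of the original comment; the
names trunk, gcc45, rel_change and compare are made up for illustration, and
the numbers are copied from the TOTAL lines above. By this arithmetic trunk
is about 6.8% slower than 4.5 in user time, about 8.0% slower in wall time,
and uses about 3.0% more GC memory.

```python
# TOTAL rows copied from the two reports above.
# Units: usr/sys/wall in seconds, ggc_kB in kilobytes.
trunk = {"usr": 73.68, "sys": 2.37, "wall": 76.91, "ggc_kB": 852775}
gcc45 = {"usr": 68.98, "sys": 2.01, "wall": 71.22, "ggc_kB": 828137}


def rel_change(new, old):
    """Relative change of new versus old, as a fraction of old."""
    return (new - old) / old


def compare(new, old):
    """Return {metric: percent change} for every metric present in both."""
    return {k: 100.0 * rel_change(new[k], old[k]) for k in old if k in new}


if __name__ == "__main__":
    for metric, pct in compare(trunk, gcc45).items():
        print(f"{metric:7s} trunk vs 4.5: {pct:+.1f}%")
```

This only compares totals for a single run of each compiler; it says nothing
about run-to-run noise or about which individual passes account for the
difference.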
