https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409

--- Comment #7 from kargl at gcc dot gnu.org ---
The attached testcase uses xmin and xmax uninitialized.
After setting xmin = 0 and xmax = 1 and adding z(1) to
the print statements to prevent the inner loop from
being optimized away, I see the following:

% gfcx -o z -O0 a.f90 && ./z
 time 1:    1.78299993E-02   7249751.00    
 time 2:    6.37416887       7249751.00    
% gfcx -o z -O1 a.f90 && ./z
 time 1:    1.37590002E-02   7249751.00    
 time 2:    6.36764479       7249751.00    
% gfcx -o z -O2 a.f90 && ./z
 time 1:    1.23690004E-02   7249751.00    
 time 2:    1.85729897       7249751.00    
% gfcx -o z -O3 a.f90 && ./z
 time 1:    2.43199989E-03   7249751.00    
 time 2:    1.85660207       7249751.00    
% gfcx -o z -Ofast a.f90 && ./z
 time 1:    3.63499997E-03   7249751.50    
 time 2:   0.621210992       7249751.50    

so the timing improves with optimization.  -fdump-tree-original still
shows the generation of a temporary variable for the actual argument
1/y in the second set of nested loops.  -fdump-tree-optimized is 
fairly difficult for me to decipher, but it appears that the 1/y
is not hoisted out of the inner loop.
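
For readers without the attachment, here is a minimal sketch of the kind of
hoisting in question. It is NOT the attached testcase: the module name, the
function f, the array names, the loop bounds, and the assumption that 1/y
depends only on the outer loop index are all hypothetical and chosen for
illustration. The second loop nest does by hand what one would hope the
optimizer does for a pure function argument.

```fortran
! Hypothetical sketch, not the testcase attached to the PR.
module funcs
  implicit none
contains
  ! A pure function: no side effects, so its arguments can safely
  ! be evaluated once and reused.
  pure function f(a, b) result(r)
    real, intent(in) :: a, b
    real :: r
    r = a * b
  end function f
end module funcs

program hoist_sketch
  use funcs, only: f
  implicit none
  integer, parameter :: n = 2000   ! assumed size, illustration only
  real :: x(n), y(n), z(n), rinv
  integer :: i, j

  call random_number(x)
  call random_number(y)
  y = y + 1.0                      ! keep y away from zero

  ! Form as written: 1.0/y(j) does not depend on i, but unless the
  ! compiler hoists it, the division is redone for each inner iteration.
  z = 0.0
  do j = 1, n
     do i = 1, n
        z(i) = z(i) + f(x(i), 1.0/y(j))
     end do
  end do
  print *, 'in loop :', z(1)

  ! Manually hoisted form: the reciprocal is computed once per j.
  z = 0.0
  do j = 1, n
     rinv = 1.0/y(j)
     do i = 1, n
        z(i) = z(i) + f(x(i), rinv)
     end do
  end do
  print *, 'hoisted :', z(1)
end program hoist_sketch
```

Printing z(1) after each loop nest serves the same purpose as in the timings
above: it keeps the loops from being optimized away. Timing the two nests
separately at each optimization level would show whether the compiler closes
the gap on its own.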
