[Bug tree-optimization/49851] IVOPTs makes a mess out of polyhedron air derivx and derivy

2011-07-26 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49851

--- Comment #1 from Richard Guenther rguenth at gcc dot gnu.org 2011-07-26 
12:53:47 UTC ---
Testcase with caller; use -fwhole-program to force inlining.

      IMPLICIT REAL*8(a-H,O-Z)
      PARAMETER (NX=150,NY=150)
      DIMENSION ux(NX,NY) , uy(NX,NY) , vx(NX,NY) , vy(NX,NY) ,
     &   tx(NX,NY)
      DIMENSION DX(NX,33) , DY(NY,33)
      DIMENSION U(NX,NY) , V(NX,NY) , P(NX,NY) , RHO(NX,NY) , E(NX,NY)
      DIMENSION NPX(30) , x(NX) , NPY(30) , y(NY) , ALX(30) , bex(30)
      COMMON /XD1   / FP1 , FM1 , FP2 , FM2 , FP3 , FM3 , FP4 , FM4 ,
     &   FP1x , FM1x , FP2x , FM2x , FP3x , FM3x , FP4x ,
     &   FM4x , FV2 , FV3 , FV4 , DXP2 , DXM2 , DXP3 ,
     &   DXM3 , DXP4 , DXM4 , DX , NPX , ALX , NDX , MXPy
      COMMON /BNDRY / U , V , P , RHO , T , E , AS1 , AS2 , AS3 , AS4 ,
     &   NX1 , NY1 , NX2 , NY2 , SIG , GMA , S0 , T0 , P0 ,
     &   PE , HOO , RR , MINlet


      CALL DERIVX(DX,U,ux,ALX,NPX,NDX,MXPy)

      END

      SUBROUTINE DERIVX(D,U,Ux,Al,Np,Nd,M)
      IMPLICIT REAL*8(A-H,O-Z)
      PARAMETER (NX=150,NY=150)
      DIMENSION D(NX,33) , U(NX,NY) , Ux(NX,NY) , Al(30) , Np(30)
      DO jm = 1 , M
         jmax = 0
         jmin = 1
         DO i = 1 , Nd
            jmax = jmax + Np(i) + 1
            DO j = jmin , jmax
               uxt = 0.
               DO k = 0 , Np(i)
                  uxt = uxt + D(j,k+1)*U(jmin+k,jm)
               ENDDO
               Ux(j,jm) = uxt*Al(i)
            ENDDO
            jmin = jmin + Np(i) + 1
         ENDDO
      ENDDO
      CONTINUE
      END


--- Comment #2 from Richard Guenther rguenth at gcc dot gnu.org 2011-07-26 
13:59:26 UTC ---
AIR spends 86% of its time in DERIV[XY] (for ICC), 78% of its time there for
GCC.
The performance difference also reproduces when not inlining DERIV[XY] at all
(though it's slightly less of a difference - GCC doesn't care).


--- Comment #3 from Richard Guenther rguenth at gcc dot gnu.org 2011-07-26 
14:32:35 UTC ---
(In reply to comment #2)
> AIR spends 86% of its time in DERIV[XY] (for ICC), 78% of its time there for
> GCC.
> The performance difference also reproduces when not inlining DERIV[XY] at all
> (though it's slightly less of a difference - GCC doesn't care).

Actually it does not.  Without inlining:

GCC:   air  2.42 4728556  3.99  10  0.1798
  SUBROUTINE DERIVX(D,U,Ux,Al,Np,Nd,M) /* derivx_ total: 8194999 42.3750 */
  SUBROUTINE DERIVY(D,U,Uy,Al,Np,Nd,M) /* derivy_ total: 8176250 42.2781 */

ICC:   air  2.90 4072563  3.45  10  0.1809
  SUBROUTINE DERIVX(D,U,Ux,Al,Np,Nd,M) /* derivx_ total: 8060834 47.2620 */
  SUBROUTINE DERIVY(D,U,Uy,Al,Np,Nd,M) /* derivy_ total: 7070627 41.4563 */

So there is not much difference in the total hits, which means ICC performs
some context-dependent optimization once DERIV[XY] are inlined.