[Bug target/34163] 10% performance regression since Nov 1 on Polyhedron's NF on AMD64

2008-04-25 Thread ubizjak at gmail dot com
--- Comment #8 from ubizjak at gmail dot com 2008-04-25 09:55 ---
The problem is indeed in trisolve:

  subroutine trisolve(x,i1,i2)
  integer :: i1 , i2
  real(dpkind),dimension(i2)::x
  integer :: i
  x(i1) = gi(i1)* x(i1)
  do i = i1+1 , i2
    x(i) = gi(i)*(x(i)-au1(i-1)*x(i-1))
  enddo
  do i =

[Bug target/34163] 10% performance regression since Nov 1 on Polyhedron's NF on AMD64

2008-04-25 Thread rguenth at gcc dot gnu dot org
--- Comment #9 from rguenth at gcc dot gnu dot org 2008-04-25 10:23 ---
Not hoisting the load from x(i) is a missed PRE opportunity. Complete testcase for the second loop:

  subroutine trisolve2(x,i1,i2,nxyz)
  integer :: nxyz
  real,dimension(nxyz):: au1
  real,allocatable,dimension(:) :: gi
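For readers unfamiliar with the term, the kind of redundant load that partial redundancy elimination (PRE) can remove from a recurrence like the first trisolve loop can be sketched in C. This is only an illustration under stated assumptions: the function names are hypothetical, the arrays are passed as parameters (in the Fortran source gi and au1 are not arguments of trisolve), indices are 0-based with inclusive bounds, and it is not clear from the truncated excerpt that this is exactly the load the comment refers to.

```c
#include <stddef.h>

/* Hypothetical C rendering of the forward sweep quoted in comment #8.
   Each iteration reloads x[i-1] from memory, although that value was
   stored by the previous iteration. */
void forward_sweep_reload(double *x, const double *gi, const double *au1,
                          size_t i1, size_t i2)
{
    x[i1] = gi[i1] * x[i1];
    for (size_t i = i1 + 1; i <= i2; i++)
        x[i] = gi[i] * (x[i] - au1[i - 1] * x[i - 1]);
}

/* Same computation with the previous result carried in a scalar, which
   is roughly the form the loop takes once the redundant load is removed.
   The store to x[i] remains; only the reload of x[i-1] disappears. */
void forward_sweep_carried(double *x, const double *gi, const double *au1,
                           size_t i1, size_t i2)
{
    double prev = gi[i1] * x[i1];
    x[i1] = prev;
    for (size_t i = i1 + 1; i <= i2; i++) {
        prev = gi[i] * (x[i] - au1[i - 1] * prev);
        x[i] = prev;
    }
}
```

Both functions perform the same arithmetic in the same order, so they should produce the same values in x; the second merely keeps the loop-carried value in a local variable instead of reading it back from the array.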

[Bug target/34163] 10% performance regression since Nov 1 on Polyhedron's NF on AMD64

2008-04-25 Thread ubizjak at gmail dot com
--- Comment #10 from ubizjak at gmail dot com 2008-04-25 11:07 ---
(In reply to comment #9)
> Not hoisting the load from x(i) is a missed PRE opportunity. Complete testcase
> for the second loop:

This is actually the first loop. Just for reference: -O2 -funroll-loops flags are

[Bug target/34163] 10% performance regression since Nov 1 on Polyhedron's NF on AMD64

2008-04-24 Thread ubizjak at gmail dot com
--- Comment #7 from ubizjak at gmail dot com 2008-04-24 19:56 --- Created an attachment (id=15527) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15527&action=view) x86_64 asm dump of trisolve procedure (generated without the patch) All the difference is in trisolve procedure

[Bug target/34163] 10% performance regression since Nov 1 on Polyhedron's NF on AMD64

2008-04-22 Thread ubizjak at gmail dot com
--- Comment #3 from ubizjak at gmail dot com 2008-04-22 16:43 ---
(In reply to comment #2)
> Well, this bug needs proper analysis and a testcase, but yes, I also see this
> slowdown.

Richi, the only difference in generated code is by backing out your patch [1] [1]

[Bug target/34163] 10% performance regression since Nov 1 on Polyhedron's NF on AMD64

2008-04-22 Thread ubizjak at gmail dot com
--- Comment #4 from ubizjak at gmail dot com 2008-04-22 16:51 ---
Confirmed also on core2:

benchmarked with patch:

22 Apr 2008 18:47:04 gfortran - Compile nf command=gfortran -march=opteron -ffast-math -funroll-loops -ftree-loop-linear -ftree-vectorize -msse3 -O3 nf.s -o nf
22 Apr

[Bug target/34163] 10% performance regression since Nov 1 on Polyhedron's NF on AMD64

2008-04-22 Thread pinskia at gcc dot gnu dot org
--- Comment #5 from pinskia at gcc dot gnu dot org 2008-04-22 18:14 ---
(In reply to comment #3)
> [1] http://gcc.gnu.org/viewcvs?view=rev&revision=129796

It was a correctness fix, which usually will slow down generated code. :) So you have to look at the difference to make sure that

[Bug target/34163] 10% performance regression since Nov 1 on Polyhedron's NF on AMD64

2008-04-22 Thread rguenth at gcc dot gnu dot org
--- Comment #6 from rguenth at gcc dot gnu dot org 2008-04-22 22:20 --- Indeed. It would be interesting to analyze what optimization the folding enabled and see if that can be recovered somehow. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34163

[Bug target/34163] 10% performance regression since Nov 1 on Polyhedron's NF on AMD64

2008-04-21 Thread ubizjak at gmail dot com
--- Comment #1 from ubizjak at gmail dot com 2008-04-21 07:11 ---
Confirmed.

--
ubizjak at gmail dot com changed:

  What   |Removed      |Added
  Status |UNCONFIRMED

[Bug target/34163] 10% performance regression since Nov 1 on Polyhedron's NF on AMD64

2008-04-21 Thread rguenth at gcc dot gnu dot org
--- Comment #2 from rguenth at gcc dot gnu dot org 2008-04-21 09:09 --- Well, this bug needs proper analysis and a testcase, but yes, I also see this slowdown. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34163