--- Comment #8 from ubizjak at gmail dot com 2008-04-25 09:55 ---
The problem is indeed in trisolve:
subroutine trisolve(x,i1,i2)
  integer :: i1 , i2
  real(dpkind),dimension(i2)::x
  integer :: i
  x(i1) = gi(i1)* x(i1)
  do i = i1+1 , i2
    x(i) = gi(i)*(x(i)-au1(i-1)*x(i-1))
  enddo
  do i =
--- Comment #9 from rguenth at gcc dot gnu dot org 2008-04-25 10:23 ---
Not hoisting the load from x(i) is a missed PRE opportunity. Complete testcase
for the second loop:
subroutine trisolve2(x,i1,i2,nxyz)
  integer :: nxyz
  real,dimension(nxyz):: au1
  real,allocatable,dimension(:) :: gi
--- Comment #10 from ubizjak at gmail dot com 2008-04-25 11:07 ---
(In reply to comment #9)
> Not hoisting the load from x(i) is a missed PRE opportunity. Complete testcase
> for the second loop:
This is actually the first loop.
Just for reference: -O2 -funroll-loops flags are
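
For readers following the exchange in comments #9 and #10, here is a minimal sketch of the loop-carried reuse in the first trisolve loop, transliterated from the Fortran above into C. It shows one reading of the missed opportunity: the value of x(i-1) was computed and stored in the previous iteration, so it could be kept in a register instead of being loaded from memory again. The function names, the 0-based indexing, and the passing of gi and au1 as arguments are assumptions made for illustration; none of this is taken from GCC itself or from the attached testcase.

```c
#include <assert.h>
#include <stddef.h>

/* Forward sweep as written in the thread (0-based, inclusive bounds):
   every iteration reads x[i - 1] back from memory. */
void forward_sweep_reload(double *x, const double *gi, const double *au1,
                          size_t i1, size_t i2)
{
    assert(i1 <= i2);
    x[i1] = gi[i1] * x[i1];
    for (size_t i = i1 + 1; i <= i2; ++i)
        x[i] = gi[i] * (x[i] - au1[i - 1] * x[i - 1]);
}

/* Hypothetical hand-transformed variant: the element computed in the
   previous iteration is carried in a local, so x[i - 1] is not reloaded. */
void forward_sweep_carried(double *x, const double *gi, const double *au1,
                           size_t i1, size_t i2)
{
    assert(i1 <= i2);
    double prev = gi[i1] * x[i1];
    x[i1] = prev;
    for (size_t i = i1 + 1; i <= i2; ++i) {
        prev = gi[i] * (x[i] - au1[i - 1] * prev);
        x[i] = prev;
    }
}
```

Both functions compute the same values; the second only makes explicit, at source level, the reuse that an optimizer could in principle perform on the first.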
--- Comment #7 from ubizjak at gmail dot com 2008-04-24 19:56 ---
Created an attachment (id=15527)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15527&action=view)
x86_64 asm dump of trisolve procedure (generated without the patch)
The only difference is in the trisolve procedure
--- Comment #3 from ubizjak at gmail dot com 2008-04-22 16:43 ---
(In reply to comment #2)
> Well, this bug needs proper analysis and a testcase, but yes, I also see this
> slowdown.
Richi, the only difference in generated code is by backing out your patch [1]
[1]
--- Comment #4 from ubizjak at gmail dot com 2008-04-22 16:51 ---
Confirmed also on core2:
benchmarked with patch:
22 Apr 2008 18:47:04 gfortran - Compile nf
command=gfortran -march=opteron -ffast-math -funroll-loops -ftree-loop-linear
-ftree-vectorize -msse3 -O3 nf.s -o nf
22 Apr
--- Comment #5 from pinskia at gcc dot gnu dot org 2008-04-22 18:14 ---
(In reply to comment #3)
> [1] http://gcc.gnu.org/viewcvs?view=rev&revision=129796
It was a correctness fix, which usually will slow down generated code. :)
So you have to look at the difference to make sure that
--- Comment #6 from rguenth at gcc dot gnu dot org 2008-04-22 22:20 ---
Indeed. It would be interesting to analyze what optimization the folding
enabled and see if that can be recovered somehow.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34163
--- Comment #1 from ubizjak at gmail dot com 2008-04-21 07:11 ---
Confirmed.
--
ubizjak at gmail dot com changed:
What|Removed |Added
Status|UNCONFIRMED
--- Comment #2 from rguenth at gcc dot gnu dot org 2008-04-21 09:09 ---
Well, this bug needs proper analysis and a testcase, but yes, I also see this
slowdown.