--- Comment #23 from irar at il dot ibm dot com 2009-11-30 12:20 ---
Applied:
http://gcc.gnu.org/viewcvs?limit_changes=0&view=revision&revision=154794
Thanks,
Ira
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108
--- Comment #22 from rguenther at suse dot de 2009-11-30 10:13 ---
Subject: Re: [4.4/4.5 Regression] Vectorizer
cannot deal with PAREN_EXPR gracefully, 50% performance regression
On Mon, 30 Nov 2009, irar at il dot ibm dot com wrote:
> --- Comment #20 from irar at il dot ibm dot
--- Comment #21 from irar at il dot ibm dot com 2009-11-30 08:54 ---
Created an attachment (id=19183)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19183&action=view)
Multiple types support patch
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108
--- Comment #20 from irar at il dot ibm dot com 2009-11-30 08:52 ---
Actually, PAREN_EXPRs are vectorizable (the support was added by you, Richard,
in your original PAREN_EXPR patch
http://gcc.gnu.org/viewcvs?limit_changes=0&view=revision&revision=132515 ).
The problem here is that vec
--- Comment #19 from rguenth at gcc dot gnu dot org 2009-11-27 11:23 ---
I guess this PR should be split further: one bug about the PAREN_EXPR wrt
vectorization and one bug about the yet unanalyzed performance regression.
--
rguenth at gcc dot gnu dot org changed:
What|R
--- Comment #18 from irar at il dot ibm dot com 2009-11-23 09:02 ---
I tried to vectorize eval.f90 with 4.3 and mainline on x86_64-suse-linux. In
both cases no loop gets vectorized in subroutine eval. The k loop is not
vectorizable because the step of x is unknown (function argument), an
--- Comment #17 from rguenth at gcc dot gnu dot org 2009-11-21 13:58 ---
I have filed PR42131 for the DO loop translation issue.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108
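For readers following the DO loop discussion in comments #12 to #17, the two loop shapes being compared can be sketched in C. This is only an illustration, not the code the Fortran frontend emits: the function names are hypothetical, and the trip-count formula is the one the Fortran standard gives for DO v = m1, m2, m3, namely MAX(INT((m2 - m1 + m3) / m3), 0). The sketch assumes m3 is nonzero and that none of the arithmetic overflows.

```c
/* Hypothetical helper: trip count of DO v = m1, m2, m3 as defined by the
   Fortran standard, MAX(INT((m2 - m1 + m3) / m3), 0).
   Assumes m3 != 0 and no overflow in the arithmetic. */
static long trip_count(long m1, long m2, long m3)
{
    /* C integer division truncates toward zero, like Fortran's INT(). */
    long n = (m2 - m1 + m3) / m3;
    return n > 0 ? n : 0;
}

/* Shape 1: the loop is controlled by a separate count-down counter,
   and the DO variable k is merely incremented alongside it. */
static long sum_with_counter(long m1, long m2, long m3)
{
    long sum = 0;
    long k = m1;
    for (long count = trip_count(m1, m2, m3); count > 0; --count) {
        sum += k;
        k += m3;
    }
    return sum;
}

/* Shape 2: the DO variable is incremented by m3 and the exit test is
   expressed on the DO variable itself. The direction of the comparison
   depends on the sign of the step. */
static long sum_with_do_variable(long m1, long m2, long m3)
{
    long sum = 0;
    if (m3 > 0) {
        for (long k = m1; k <= m2; k += m3)
            sum += k;
    } else {
        for (long k = m1; k >= m2; k += m3)
            sum += k;
    }
    return sum;
}
```

Both shapes visit the same values of k whenever the arithmetic stays in range. Shape 2 can overflow when m2 is close to the largest representable value, which is one reason a frontend might prefer a separate counter.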
--- Comment #16 from rguenther at suse dot de 2009-11-21 12:19 ---
Subject: Re: [4.4/4.5 Regression] Vectorizer
cannot deal with PAREN_EXPR gracefully, 50% performance regression
On Sat, 21 Nov 2009, toon at moene dot org wrote:
> --- Comment #15 from toon at moene dot org 2009-
--- Comment #15 from toon at moene dot org 2009-11-21 12:11 ---
> I don't see that the standard suggests the specific code the Frontend
> generates. In fact it should be valid to increment the DO variable
> by m3 and express the exit test in terms of the DO variable as well.
The Standa
--- Comment #14 from rguenth at gcc dot gnu dot org 2009-11-20 23:48 ---
(In reply to comment #13)
> > The funny conditional initialization of countm1.6 makes the analysis of
> > the number of iterations of this loop impossible (not to mention the
> > conversions to character(kind=4)).
--- Comment #13 from toon at moene dot org 2009-11-20 19:45 ---
> The funny conditional initialization of countm1.6 makes the analysis of
> the number of iterations of this loop impossible (not to mention the
> conversions to character(kind=4)).
> Why does the frontend do induction vari
--- Comment #12 from rguenth at gcc dot gnu dot org 2009-11-20 14:13 ---
The loop is not unrolled because the frontend presents us with very funny
obfuscated code:
  do k=i,nnd,n
    temp=temp+(x(k)-x(k+jmini))**2
  end do
gets translated to
{
character(kind=4) countm1
--- Comment #11 from sfilippone at uniroma2 dot it 2009-11-20 14:12 ---
(In reply to comment #10)
Again, I am not asking for help in writing better code (I think I know how to
handle this, and I will convince my colleague), I just thought it was worth
mentioning that the optimizer has a
--- Comment #10 from sfilippone at uniroma2 dot it 2009-11-20 14:03 ---
(In reply to comment #9)
> I am rather confused by some comments:
>
> (1) Although I am not fluent with x86 assembly, I am pretty sure that no code
> in eval is vectorized (assembly taken from this pr or from the or
--- Comment #9 from dominiq at lps dot ens dot fr 2009-11-20 13:45 ---
I am rather confused by some comments:
(1) Although I am not fluent with x86 assembly, I am pretty sure that no code
in eval is vectorized (assembly taken from this pr or from the original post
http://gcc.gnu.org/ml/
--- Comment #8 from sfilippone at uniroma2 dot it 2009-11-20 08:32 ---
(In reply to comment #6)
> Richard Guenther wrote:
>
> > Well, within eval there's nothing really obvious to me. The
> > innermost loop is exactly the same:
>
> But it is a very inefficient way of vectorizing, beca
--- Comment #7 from anlauf at gmx dot de 2009-11-19 22:33 ---
I tried the code on an x86 Core2 system (32 bit mode).
gfortran 4.3, 4.5:
22.74user 0.03system 0:22.82elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
Intel's ifort 11.1 is only ~5% faster, but:
SunStudio 12.1: (sunf95 -fast
--- Comment #6 from toon at moene dot org 2009-11-19 19:53 ---
Richard Guenther wrote:
> Well, within eval there's nothing really obvious to me. The
> innermost loop is exactly the same:
But it is a very inefficient way of vectorizing, because the inner loop's body
is either executed
--- Comment #5 from sfilippone at uniroma2 dot it 2009-11-19 19:42 ---
(In reply to comment #4)
> Subject: Re: [4.4/4.5 Regression] Vectorizer
> cannot deal with PAREN_EXPR gracefully, 50% performance regression
>
>
> Heh, with -fwhole-program GCC optimizes the test away and I get 0.
--- Comment #4 from rguenther at suse dot de 2009-11-19 17:30 ---
Subject: Re: [4.4/4.5 Regression] Vectorizer
cannot deal with PAREN_EXPR gracefully, 50% performance regression
On Thu, 19 Nov 2009, sfilippone at uniroma2 dot it wrote:
> --- Comment #3 from sfilippone at uniroma2
--- Comment #3 from sfilippone at uniroma2 dot it 2009-11-19 17:17 ---
(In reply to comment #2)
> -ftree-vectorizer-verbose=2 tells you:
>
> eval.f90:35: note: not vectorized: relevant stmt not supported: D.1684_73 =
> ((D.1683_72));
>
> eval.f90:32: note: not vectorized: relevant stmt
--- Comment #2 from rguenth at gcc dot gnu dot org 2009-11-19 16:49 ---
-ftree-vectorizer-verbose=2 tells you:
eval.f90:35: note: not vectorized: relevant stmt not supported: D.1684_73 =
((D.1683_72));
eval.f90:32: note: not vectorized: relevant stmt not supported: D.1684_58 =
((D.1683