--- Comment #3 from irar at il dot ibm dot com 2010-09-20 13:08 ---
For vector(2) void * we get vec_perm_v2di_u builtin declaration, because the
mode of vector(2) void * is unsigned V2DI.
I wonder if this can happen for every builtin call, and we should convert back
to the original
--- Comment #2 from irar at il dot ibm dot com 2010-09-20 12:17 ---
Looks like it is caused by revision 164367:
http://gcc.gnu.org/ml/gcc-cvs/2010-09/msg00661.html
--
irar at il dot ibm dot com changed:
What|Removed |Added
--- Comment #7 from irar at il dot ibm dot com 2010-09-20 06:43 ---
Fixed.
--
irar at il dot ibm dot com changed:
What|Removed |Added
Status|NEW
--- Comment #5 from irar at il dot ibm dot com 2010-09-19 10:08 ---
Right. This patch fixes it:
Index: tree-vect-stmts.c
===
--- tree-vect-stmts.c (revision 164332)
+++ tree-vect-stmts.c (working copy)
@@ -4478,6
--- Comment #3 from irar at il dot ibm dot com 2010-09-19 08:52 ---
gimple_bb (stmt) returns NULL for that statement (D.1575_33 = __builtin_pow
(D.1542_14, D.1574_32)).
We can avoid vectorization in such cases, but it looks like it should be fixed to
return the actual basic block.
Ira
--- Comment #9 from irar at il dot ibm dot com 2010-09-12 09:46 ---
OK, thanks. I am going to test this patch, it only checks data-refs and
function calls:
Index: tree-vect-data-refs.c
===
--- tree-vect-data-refs.c
--- Comment #6 from irar at il dot ibm dot com 2010-09-01 11:54 ---
(In reply to comment #5)
> I see before SLP:
>
> :
> MEM[(struct A *)this_1(D)].a = 0;
> MEM[(struct A *)this_1(D)].b = 0;
> MEM[(struct A *)this_1(D)].c = 0;
> [LP 2] MEM[(struct A *)t
--- Comment #3 from irar at il dot ibm dot com 2010-09-01 09:06 ---
r163260 only made this BB vectorizable.
I checked lookup_stmt_eh_lp for the last stmt of the BB and EDGE_EH flags
before and after vectorization (basic block SLP), and in both cases
lookup_stmt_eh_lp returns 0 and
--- Comment #7 from irar at il dot ibm dot com 2010-08-11 10:24 ---
(In reply to comment #6)
> I think that SLP doesn't handle reduction.
>
Not all kinds of reduction. We handle
#a1 = phi
#b1 = phi
...
a2 = a1 + x
b2 = b1 + y
Here we also have:
#a1 = phi
...
a2 = a1
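For illustration only: a source loop of roughly the following shape would be expected to give the first form listed above, with two phi nodes and two independent additions per iteration. The function and variable names are hypothetical, and this sketch makes no claim about whether a particular GCC revision actually applies SLP to it.

```c
#include <assert.h>
#include <stddef.h>

/* Two independent sum reductions carried by the same loop.
   In SSA form each accumulator becomes a phi at the loop header
   (a1, b1) followed by one addition (a2 = a1 + x, b2 = b1 + y). */
void sum_pair(const int *x, const int *y, size_t n, int *a_out, int *b_out)
{
    assert(a_out != NULL && b_out != NULL);

    int a = 0;
    int b = 0;
    for (size_t i = 0; i < n; i++) {
        a = a + x[i];   /* a2 = a1 + x */
        b = b + y[i];   /* b2 = b1 + y */
    }
    *a_out = a;
    *b_out = b;
}
```

Calling sum_pair with two four-element arrays fills a_out and b_out with the two sums; the two accumulators never feed each other, which is what makes them a candidate group.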
--- Comment #5 from irar at il dot ibm dot com 2010-08-10 10:23 ---
(In reply to comment #1)
> This patch should be a valid fix, because the recognition of the dot_prod
> pattern is known to be fail at this point if the stmt is outside the loop.
> (I am not sure whether we s
--- Comment #4 from irar at il dot ibm dot com 2010-08-10 09:06 ---
I am testing the same patch as in comment #1.
Testcase that shows the problem:
int
foo(short x)
{
short i, y;
int sum;
for (i = 0; i < x; i++)
y = x * i;
for (i = x; i > 0; i--)
sum += y;
--- Comment #4 from irar at il dot ibm dot com 2010-07-27 09:25 ---
I am testing a patch.
--
irar at il dot ibm dot com changed:
What|Removed |Added
AssignedTo
--- Comment #1 from irar at il dot ibm dot com 2010-07-08 09:14 ---
The failure is in vectorizable_store():
/* If accesses through a pointer to vectype do not alias the original
memory reference we have a problem. This should never happen
--- Comment #1 from irar at il dot ibm dot com 2010-06-29 11:00 ---
Created an attachment (id=21037)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21037&action=view)
Full testcase
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44711
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: irar at il dot ibm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44711
--- Comment #1 from irar at il dot ibm dot com 2010-06-29 09:11 ---
Created an attachment (id=21036)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21036&action=view)
Full testcase
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44710
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: irar at il dot ibm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44710
--- Comment #7 from irar at il dot ibm dot com 2010-06-13 12:01 ---
(In reply to comment #6)
> (In reply to comment #5)
> > The bug is in creation of a neutral value for BIT_AND_EXPR. What is the
> > correct
> > way to create it for all types? I found
> > do
--- Comment #5 from irar at il dot ibm dot com 2010-06-13 10:29 ---
The bug is in creation of a neutral value for BIT_AND_EXPR. What is the correct
way to create it for all types? I found
double-int.h:#define ALL_ONES (~((unsigned HOST_WIDE_INT) 0))
but it won't work for s
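As a plain C sketch of the underlying point, independent of GCC internals: the neutral value for a bitwise AND reduction is a value with every bit of the element type set, so it has to be built for the element's own width. The helper names below are hypothetical and are not taken from any patch in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All-ones mask for an unsigned element of the given width (1..64 bits).
   The shift is skipped for 64 because shifting by the full width is
   undefined in C. */
uint64_t all_ones(unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    return bits == 64 ? ~UINT64_C(0) : (UINT64_C(1) << bits) - 1;
}

/* AND-reduce an array of 8-bit elements, starting from the neutral
   value so that an empty input leaves the accumulator unchanged. */
uint8_t and_reduce_u8(const uint8_t *v, size_t n)
{
    uint8_t acc = (uint8_t) all_ones(8);
    for (size_t i = 0; i < n; i++)
        acc &= v[i];
    return acc;
}
```

Starting the accumulator at zero instead would force the result to zero, which is the kind of wrong neutral value the comment is concerned with.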
--- Comment #5 from irar at il dot ibm dot com 2010-05-20 10:24 ---
Even if we are talking about less than vector size from array boundary? And
that boundary is not (vector) aligned.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44183
--- Comment #3 from irar at il dot ibm dot com 2010-05-20 10:04 ---
I am curious what is the problem with that? These elements are not used, they
are just loaded...
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44183
--- Comment #1 from irar at il dot ibm dot com 2010-05-20 07:13 ---
Do you mean that extract_even implementation does something illegal with this
last element? Misaligned load also accesses elements outside the array, but the
problem is in extract_even?
Other than doing something in
--- Comment #16 from irar at il dot ibm dot com 2010-05-10 08:17 ---
Fixed.
--
irar at il dot ibm dot com changed:
What|Removed |Added
Status|ASSIGNED
--- Comment #14 from irar at il dot ibm dot com 2010-05-05 09:02 ---
> It tries to get a _vector_ type of the same size. In theory each
> vectorization method can choose whatever vector size suits them
> most (as for external defs they need to build up a vector of e
--- Comment #12 from irar at il dot ibm dot com 2010-05-03 12:30 ---
> Well. For loops we'd have disqualified it as there is no vector
> type for the external def (well, the stmt inside the loop).
I don't think that's true. With -fno-tree-pre we get
--- Comment #10 from irar at il dot ibm dot com 2010-05-02 12:12 ---
Looks like it's caused by:
r158157 | rguenth | 2010-04-09 13:40:14 +0300 (Fri, 09 Apr 2010) | 28 lines
The problem is in getting vectype for f1_2:
foo (int b, double f1, double f2, int c1, int c2)
{
...
fl
--- Comment #9 from irar at il dot ibm dot com 2010-05-02 11:08 ---
Thanks, Uros! I reproduced the ICE using your instructions.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43901
--- Comment #4 from irar at il dot ibm dot com 2010-05-02 05:51 ---
I don't have access to ia64. I tried to change the types in the test to make
the basic blocks vectorizable on x86_64, but didn't get any error. So I still
need SLP dump in order to solve this.
Thanks,
Ira
--- Comment #1 from irar at il dot ibm dot com 2010-04-27 05:53 ---
Could you please give some more information? It doesn't fail on x86_64-linux.
(For SLP dump please use -fdump-tree-slp-details).
Thanks,
Ira
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43901
--- Comment #6 from irar at il dot ibm dot com 2010-04-22 18:11 ---
Yes, sorry about that. I updated the ChangeLogs.
Ira
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43482
--
irar at il dot ibm dot com changed:
What|Removed |Added
AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com
|dot org
--- Comment #10 from irar at il dot ibm dot com 2010-04-21 18:33 ---
Thanks. So, it is not always profitable and requires a cost model.
I am now working on a cost model for basic block vectorization; I can look at
this once we have one.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id
--- Comment #8 from irar at il dot ibm dot com 2010-04-21 11:33 ---
Yes, it's possible to add this to SLP. But I don't understand how
D.3154_3 = COMPLEX_EXPR ;
should be vectorized. D.3154_3 is complex and the rhs will be a vector
{D.3163_8, D.3164_9} (btw, we have to chang
--- Comment #5 from irar at il dot ibm dot com 2010-04-19 14:35 ---
Fixed.
--
irar at il dot ibm dot com changed:
What|Removed |Added
Status|NEW
--- Comment #7 from irar at il dot ibm dot com 2010-04-19 07:48 ---
Fixed on 4.6, 4.5 and 4.4.
--
irar at il dot ibm dot com changed:
What|Removed |Added
--
irar at il dot ibm dot com changed:
What|Removed |Added
AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com
|dot org
--- Comment #5 from irar at il dot ibm dot com 2010-04-08 17:59 ---
In GCC 4.4 the smaller loop gets completely unrolled before the vectorizer.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43692
--- Comment #3 from irar at il dot ibm dot com 2010-04-08 17:33 ---
Both loops get vectorized for me with -O3 on x86_64-suse-linux.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43692
--- Comment #1 from irar at il dot ibm dot com 2010-04-08 17:14 ---
It probably happens because the vectorization is not profitable. Try
-fno-vect-cost-model flag.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43692
--- Comment #7 from irar at il dot ibm dot com 2010-03-28 18:22 ---
(In reply to comment #5)
> When defining the missing function like this:
>
> static inline int mid_pred(int a, int b, int c)
> {
> int t= (a-b)&((a-b)>>31);
> a-=t;
> b
--- Comment #6 from irar at il dot ibm dot com 2010-03-28 18:05 ---
(In reply to comment #4)
> What about fixing the diagnostic message like this:
>
It would be nice to do the same for SLP (compute_data_dependences_for_bb) for
completeness.
Thanks,
Ira
> diff --git a/gcc/
--- Comment #1 from irar at il dot ibm dot com 2010-03-28 11:16 ---
Looks similar to PR 32806.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43543
--- Comment #3 from irar at il dot ibm dot com 2010-03-28 11:07 ---
(In reply to comment #1)
> hadamard8_diff.c:44: note: not vectorized: unhandled data-ref
There is a function call in this loop as well.
> hadamard8_diff.c:26: note: not vectorized: data ref analysis failed D.2
--- Comment #2 from irar at il dot ibm dot com 2010-03-28 10:58 ---
(In reply to comment #0)
> sub_hfyu_median_prediction.c:18: note: not vectorized: unhandled data-ref
>
> Looking with GDB at it, I get:
> (gdb) p debug_data_references (datarefs)
> (Data Ref:
>
--- Comment #1 from irar at il dot ibm dot com 2010-03-28 09:41 ---
(In reply to comment #0)
> What does this message mean?
> "vector iteration cost = 2056 is divisible by scalar iteration cost = 4 by a
> factor greater than or equal to the vectorization factor =
--- Comment #2 from irar at il dot ibm dot com 2010-03-28 08:59 ---
I think PR 35229 covers this issue.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43425
--- Comment #17 from irar at il dot ibm dot com 2010-02-22 09:01 ---
Is there a way to pass alignment information similar to PR 39954?
Otherwise, a proper fix would be some inter-procedural analysis... Meantime, we
can do intra-procedural analysis and fail when we reach function
--
irar at il dot ibm dot com changed:
What|Removed |Added
AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com
|dot org
--- Comment #3 from irar at il dot ibm dot com 2010-01-24 07:39 ---
This has already been discussed in PR 41464.
Ira
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42846
--- Comment #13 from irar at il dot ibm dot com 2010-01-18 12:17 ---
Does something like this make sense? (With this patch we will never use peeling
for function parameters, unless the builtin returns OK to peel for packed
types).
Index: tree-vect-data-refs.c
--
irar at il dot ibm dot com changed:
What|Removed |Added
AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com
|dot org
--- Comment #10 from irar at il dot ibm dot com 2010-01-13 09:35 ---
Yes, I understand that we can't assume that an access is aligned if we can't
prove it's aligned. I don't understand how we can prove that a COMPONENT_REF is
aligned, i.e., if there is a way to
--- Comment #8 from irar at il dot ibm dot com 2010-01-12 08:08 ---
So, to be on the safe side, we should assume that COMPONENT_REFs are not
naturally aligned and never use peeling for them?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652
--- Comment #43 from irar at il dot ibm dot com 2010-01-10 13:43 ---
Since -O2 -ftree-vectorize doesn't cause bad code, it has to be some other
optimization on top of vectorized code that causes the problem.
Bad code is generated when the alignment of 'reduce' is
--- Comment #5 from irar at il dot ibm dot com 2010-01-10 08:22 ---
In vector_alignment_reachable_p() we check if an access is packed using
contains_packed_reference(). For packed accesses we return false, meaning
alignment is unreachable and peeling cannot be used.
In the attached
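A small C sketch of why a packed access makes alignment unreachable by peeling. It assumes the GCC packed attribute and a struct that itself starts at an aligned address. The type and function names are hypothetical, and peeling_can_align is only an illustration of the arithmetic, not the logic of vector_alignment_reachable_p().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Packed: no padding after 'tag', so 'data' starts at byte offset 1. */
struct __attribute__((packed)) packed_rec {
    char    tag;
    int32_t data[4];
};

/* Not packed: the compiler pads 'data' to its natural alignment. */
struct plain_rec {
    char    tag;
    int32_t data[4];
};

size_t packed_data_offset(void) { return offsetof(struct packed_rec, data); }
size_t plain_data_offset(void)  { return offsetof(struct plain_rec, data); }

/* Peeling k scalar iterations moves the address by k * elem_size, so
   (assuming elem_size divides the vector alignment) a vector boundary
   can only be reached when the misalignment is a multiple of elem_size. */
int peeling_can_align(size_t misalign, size_t elem_size)
{
    assert(elem_size > 0);
    return misalign % elem_size == 0;
}
```

For the packed layout the misalignment is 1 byte and every peeled iteration adds 4, so no number of peeled iterations lands on a vector boundary.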
--- Comment #42 from irar at il dot ibm dot com 2010-01-05 09:09 ---
So, it's enough to force alignment of reduce only (and to vectorize its loop)
to get wrong code. On the other hand, the result of the vectorized loop is
correct, and the problem is in choosing the correct index of
--- Comment #7 from irar at il dot ibm dot com 2009-12-30 10:16 ---
The bug is in SLP load permutation analysis. I am testing a patch.
--
irar at il dot ibm dot com changed:
What|Removed |Added
--- Comment #40 from irar at il dot ibm dot com 2009-12-23 14:49 ---
(In reply to comment #39)
> I have regtested the patch in comment #31 and I have ~75 regressions on
> x86_64-apple-darwin10 in the gcc vect test suite (~100 on
> powerpc-apple-darwin9). Is this expected? a
--- Comment #38 from irar at il dot ibm dot com 2009-12-23 07:55 ---
Created an attachment (id=19378)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19378&action=view)
Force alignment of reduce only
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41082
--- Comment #37 from irar at il dot ibm dot com 2009-12-23 07:54 ---
Created an attachment (id=19377)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19377&action=view)
Force alignment but don't vectorize
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41082
--- Comment #36 from irar at il dot ibm dot com 2009-12-23 07:54 ---
Thanks!
So, it is alignment of the vectorized arrays. I'd like to do two more checks:
1. Just force alignment of the two arrays (temp and reduce) and do not
vectorize.
2. Force alignment of reduce only (and vect
--- Comment #32 from irar at il dot ibm dot com 2009-12-22 11:44 ---
Created an attachment (id=19371)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19371&action=view)
force alignment of vectorized arrays only
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41082
--- Comment #31 from irar at il dot ibm dot com 2009-12-22 11:43 ---
Created an attachment (id=19370)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19370&action=view)
disable alignment forcing
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41082
--- Comment #30 from irar at il dot ibm dot com 2009-12-22 11:42 ---
We can try to verify the alignment issue by applying the two hacks I am
attaching.
The first one disables alignment forcing for all the data-refs (and marks the
alignment as unknown). The loops are still vectorizable
--- Comment #28 from irar at il dot ibm dot com 2009-12-20 13:59 ---
Hm, I don't know, but this is my best guess - we change something in the code
that goes wrong...
We also force alignment of reduce, but the reduction computation looks ok.
--
http://gcc.gnu.org/bug
--- Comment #26 from irar at il dot ibm dot com 2009-12-20 13:46 ---
I think the problem is in alignment. We force alignment of temp.6 and temp.20 -
the arrays of relevant comparison results - even though we don't vectorize
their loop. The decision whether we can force alignment is
--- Comment #23 from irar at il dot ibm dot com 2009-12-20 12:18 ---
The code that now gets vectorized is the summation of array 'reduce':
sum(reduce). It looks like the problem is with adding the reduction result to
the correct index of 'temp' (scalar code), and no
--- Comment #21 from irar at il dot ibm dot com 2009-12-16 12:01 ---
Thanks.
I'll be able to look at this only on Sunday due to holidays.
Ira
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41082
--- Comment #16 from irar at il dot ibm dot com 2009-12-15 13:35 ---
But in comment #5 you wrote that it passes with the print, right? So, this dump
contains correct or incorrect code?
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41082
--- Comment #14 from irar at il dot ibm dot com 2009-12-15 13:08 ---
Created an attachment (id=19311)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19311&action=view)
powerpc64-suse-linux vect dump
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41082
--- Comment #13 from irar at il dot ibm dot com 2009-12-15 13:07 ---
(In reply to comment #12)
> > Looks that it has to be my patch that enables vectorization of conditions:
> I am doing a clean bootstrap of C and FORTRAN of revision 149805 to see if the
> test works for i
--- Comment #11 from irar at il dot ibm dot com 2009-12-15 10:59 ---
Looks like it has to be my patch that enables vectorization of conditions:
r149806 | irar | 2009-07-20 14:59:10 +0300 (Mon, 20 Jul 2009) | 19 lines
* tree-vectorizer.h (vectorizable_condition): Add
--- Comment #7 from irar at il dot ibm dot com 2009-12-15 08:25 ---
I can't reproduce it with current mainline on powerpc64-suse-linux. Could you
please attach the vectorizer dump? Does the good old version get vectorized? If
so, could you please attach it as well?
Thanks
--- Comment #3 from irar at il dot ibm dot com 2009-12-06 13:25 ---
On powerpc64-suse-linux with current trunk calculix failed after a couple of
minutes with
-O3 -maltivec -ffast-math
-O3 -maltivec -ffast-math -fno-tree-vectorize
-O2 -maltivec -ffast-math
-O1 -maltivec -ffast-math
--- Comment #23 from irar at il dot ibm dot com 2009-11-30 12:20 ---
Applied:
http://gcc.gnu.org/viewcvs?limit_changes=0&view=revision&revision=154794
Thanks,
Ira
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108
--- Comment #21 from irar at il dot ibm dot com 2009-11-30 08:54 ---
Created an attachment (id=19183)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19183&action=view)
Multiple types support patch
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108
--- Comment #20 from irar at il dot ibm dot com 2009-11-30 08:52 ---
Actually, PAREN_EXPRs are vectorizable (the support was added by you, Richard,
in your original PAREN_EXPR patch
http://gcc.gnu.org/viewcvs?limit_changes=0&view=revision&revision=132515 ).
The problem here
--
irar at il dot ibm dot com changed:
What|Removed |Added
AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com
|dot org
--- Comment #18 from irar at il dot ibm dot com 2009-11-23 09:02 ---
I tried to vectorize eval.f90 with 4.3 and mainline on x86_64-suse-linux. In
both cases no loop gets vectorized in subroutine eval. The k loop is not
vectorizable because the step of x is unknown (function argument
--- Comment #5 from irar at il dot ibm dot com 2009-11-12 07:51 ---
(In reply to comment #4)
> I didn't check yet. We'll work on a simple cost-model integration of
> predcom.
You mean, vectorizer cost model will take predcom into account?
If the vectorization is no
--- Comment #3 from irar at il dot ibm dot com 2009-11-10 10:02 ---
(In reply to comment #0)
> This causes mgrid score to drop
> by almost 40% on x86_64 and the vectorized code is pretty bad because it
> uses unaligned accesses.
Is the vectorized code worse than the scalar
--- Comment #6 from irar at il dot ibm dot com 2009-09-27 09:56 ---
(In reply to comment #5)
> >
> > "aligned to" refers to the offset misalignment and not to the misalignment
> > of
> > base.
> Hmm, I believe it refers to base + offse
--- Comment #4 from irar at il dot ibm dot com 2009-09-27 08:06 ---
(In reply to comment #1)
> The interesting thing is that data-ref analysis sees 128bit alignment but
> the vectorizer still produces
> vect_var_.24_59 = M*vect_p.20_57{misalignment: 0};
> D.2564_12
--- Comment #9 from irar at il dot ibm dot com 2009-09-08 05:51 ---
Looks related to PR 39907.
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41288
--- Comment #12 from irar at il dot ibm dot com 2009-08-13 11:37 ---
Created an attachment (id=18351)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18351&action=view)
The assembly for the int version (correct)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41019
--- Comment #11 from irar at il dot ibm dot com 2009-08-13 11:36 ---
Created an attachment (id=18350)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18350&action=view)
The assembly for the long int version (wrong code)
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41019
--- Comment #10 from irar at il dot ibm dot com 2009-08-13 11:34 ---
Reduced testcase:
#include
#include
#define N 4
long int a[N];
int main ()
{
int k;
for (k = 0; k < N; ++k)
a[k] = a[k] != 5 ? 12 : 10;
for (k = 0; k < N; ++k)
printf ("%u ", a[k])
--- Comment #8 from irar at il dot ibm dot com 2009-08-13 05:40 ---
(In reply to comment #7)
> Oh. Did you manage to reduce or reproduce with a smaller testcase?
No, I just looked at the vectorized loops. The guilty one is
bin/../lib/gcc/x86_64-unknown-linux-gnu/4.
--- Comment #6 from irar at il dot ibm dot com 2009-08-12 12:14 ---
Looks like a problem in data-ref analysis:
Creating dr for this_6(D)->_M_x[__k_87]
...
base_address: this_6(D)
offset from base address: 0
constant offset from base address: 0
step
--- Comment #3 from irar at il dot ibm dot com 2009-08-09 12:15 ---
Fixed.
--
irar at il dot ibm dot com changed:
What|Removed |Added
Status|UNCONFIRMED
--- Comment #10 from irar at il dot ibm dot com 2009-08-06 10:49 ---
Yes. The problem is that only a basic implementation was added. To vectorize
this code, several improvements must be made: support stmt group sizes greater
than vector size, allow loads and stores to the same location
--- Comment #41 from irar at il dot ibm dot com 2009-07-28 08:12 ---
That requires pattern recognition. MIN/MAX_EXPR are recognized by the first
phiopt pass, so MIN/MAXLOC should be either also recognized there or in the
vectorizer. (The phiopt pass transforms if clause to MIN/MAX_EXPR
--- Comment #38 from irar at il dot ibm dot com 2009-07-27 12:44 ---
I am not sure that that kind of computation can be generated automatically,
since in general the order of calculation of cond_expr cannot be changed.
However, the loop can be split:
for (i = 0; i < end
--- Comment #34 from irar at il dot ibm dot com 2009-07-27 08:36 ---
(In reply to comment #33)
> Using the example from comment 23 with
...
> gfortran shows: test.f90:12: note: not vectorized: unsupported use in stmt.
> and needs 2.272s. (By comparison. 4.4 needs 3.688s.)
Th
--- Comment #32 from irar at il dot ibm dot com 2009-07-26 07:48 ---
(In reply to comment #30)
> Regarding the just committed inline version: It would be interesting to know
> whether it is vectorizable (with/without -ffinite-math-only [i.e.
> -ffast-math]).
It depends on wh
--- Comment #5 from irar at il dot ibm dot com 2009-07-26 07:04 ---
Fixed.
--
irar at il dot ibm dot com changed:
What|Removed |Added
Status|ASSIGNED
--- Comment #28 from irar at il dot ibm dot com 2009-07-20 12:03 ---
I've just committed a patch that adds support for cond_expr in reductions in
nested cycles (http://gcc.gnu.org/ml/gcc-patches/2009-07/msg01124.html).
cond_expr cannot be vectorized in reduction of inner-most
--- Comment #7 from irar at il dot ibm dot com 2009-07-20 11:18 ---
AFAIU, querying for the component type of complex type is not difficult to
implement.
I think that loop-based vectorization is preferable here, so we should stay
with vectorization factor of 2 for doubles.
The next
--- Comment #3 from irar at il dot ibm dot com 2009-07-19 09:35 ---
Testing a fix.
Ira
--
irar at il dot ibm dot com changed:
What|Removed |Added
AssignedTo
--- Comment #6 from irar at il dot ibm dot com 2009-07-16 17:31 ---
(In reply to comment #3)
> > make_vector_type returns NULL for this type.
> Yes - there is no vector type for complex double. But the vectorizer
> could query for a vector type for the complex component