[Bug tree-optimization/45714] [4.6 Regression] Vectorization of double pow function causes a segmentation fault

2010-09-20 Thread irar at il dot ibm dot com
--- Comment #7 from irar at il dot ibm dot com 2010-09-20 06:43 --- Fixed. -- irar at il dot ibm dot com changed: What|Removed |Added Status|NEW

[Bug tree-optimization/45733] [4.6 Regression] ICE: verify_stmts failed: invalid conversion in gimple call with -fstrict-overflow -ftree-vectorize

2010-09-20 Thread irar at il dot ibm dot com
--- Comment #2 from irar at il dot ibm dot com 2010-09-20 12:17 --- Looks like it is caused by revision 164367: http://gcc.gnu.org/ml/gcc-cvs/2010-09/msg00661.html -- irar at il dot ibm dot com changed: What|Removed |Added

[Bug tree-optimization/45733] [4.6 Regression] ICE: verify_stmts failed: invalid conversion in gimple call with -fstrict-overflow -ftree-vectorize

2010-09-20 Thread irar at il dot ibm dot com
--- Comment #3 from irar at il dot ibm dot com 2010-09-20 13:08 --- For vector(2) void * we get vec_perm_v2di_u builtin declaration, because the mode of vector(2) void * is unsigned V2DI. I wonder if this can happen for every builtin call, and we should convert back to the original

[Bug tree-optimization/45714] [4.6 Regression] Vectorization of double pow function causes a segmentation fault

2010-09-19 Thread irar at il dot ibm dot com
--- Comment #3 from irar at il dot ibm dot com 2010-09-19 08:52 --- gimple_bb (stmt) returns NULL for that statement (D.1575_33 = __builtin_pow (D.1542_14, D.1574_32)). We can avoid vectorization in such cases, but looks like it should be fixed to return the actual basic block. Ira

[Bug tree-optimization/45714] [4.6 Regression] Vectorization of double pow function causes a segmentation fault

2010-09-19 Thread irar at il dot ibm dot com
--- Comment #5 from irar at il dot ibm dot com 2010-09-19 10:08 --- Right. This patch fixes it: Index: tree-vect-stmts.c === --- tree-vect-stmts.c (revision 164332) +++ tree-vect-stmts.c (working copy) @@ -4478,6

[Bug tree-optimization/45470] [4.6 Regression] ICE: verify_flow_info failed: BB 2 can not throw but has an EH edge with -ftree-vectorize -fnon-call-exceptions

2010-09-12 Thread irar at il dot ibm dot com
--- Comment #9 from irar at il dot ibm dot com 2010-09-12 09:46 --- OK, thanks. I am going to test this patch, it only checks data-refs and function calls: Index: tree-vect-data-refs.c === --- tree-vect-data-refs.c

[Bug tree-optimization/45470] [4.6 Regression] ICE: verify_flow_info failed: BB 2 can not throw but has an EH edge with -ftree-vectorize -fnon-call-exceptions

2010-09-01 Thread irar at il dot ibm dot com
--- Comment #3 from irar at il dot ibm dot com 2010-09-01 09:06 --- r163260 only made this BB vectorizable. I checked lookup_stmt_eh_lp for the last stmt of the BB and EDGE_EH flags before and after vectorization (basic block SLP), and in both cases lookup_stmt_eh_lp returns 0

[Bug tree-optimization/45470] [4.6 Regression] ICE: verify_flow_info failed: BB 2 can not throw but has an EH edge with -ftree-vectorize -fnon-call-exceptions

2010-09-01 Thread irar at il dot ibm dot com
--- Comment #6 from irar at il dot ibm dot com 2010-09-01 11:54 --- (In reply to comment #5) I see before SLP: bb 2: MEM[(struct A *)this_1(D)].a = 0; MEM[(struct A *)this_1(D)].b = 0; MEM[(struct A *)this_1(D)].c = 0; [LP 2] MEM[(struct A *)this_1(D) + 12B].a = 0

[Bug tree-optimization/41881] [4.5/4.6 regression] Complete unrolling (inner) versus vectorization of reduction

2010-08-11 Thread irar at il dot ibm dot com
--- Comment #7 from irar at il dot ibm dot com 2010-08-11 10:24 --- (In reply to comment #6) I think that SLP doesn't handle reduction. Not all kinds of reduction. We handle #a1 = phi a0, a2 #b1 = phi b0, b2 ... a2 = a1 + x b2 = b1 + y Here we also have: #a1 = phi a0, a9 ... a2
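The reduction shape the comment above says is handled (two independent accumulators, each updated once per iteration) can be sketched in C as follows. The function name `sum_pairs` and the interleaved input layout are assumptions made for this illustration and are not taken from the PR's testcase; whether a particular GCC version vectorizes it is not claimed here.

```c
#include <stddef.h>

/* Two independent sum reductions in one loop body:
     a2 = a1 + x;  b2 = b1 + y;
   Even elements feed accumulator a, odd elements feed accumulator b. */
void sum_pairs(const int *x, size_t n_pairs, int *out_a, int *out_b)
{
    int a = 0;  /* a0 */
    int b = 0;  /* b0 */
    for (size_t i = 0; i < n_pairs; i++) {
        a += x[2 * i];      /* a2 = a1 + x */
        b += x[2 * i + 1];  /* b2 = b1 + y */
    }
    *out_a = a;
    *out_b = b;
}
```

The case the comment contrasts this with, where one accumulator is chained through several statements before reaching the phi again, is a different shape and is not shown.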

[Bug tree-optimization/45241] CPU2006 465.tonto ICE in the vectorizer with -fno-tree-pre

2010-08-10 Thread irar at il dot ibm dot com
--- Comment #4 from irar at il dot ibm dot com 2010-08-10 09:06 --- I am testing the same patch as in comment #1. Testcase that shows the problem: int foo(short x) { short i, y; int sum; for (i = 0; i < x; i++) y = x * i; for (i = x; i > 0; i--) sum += y; return sum

[Bug tree-optimization/45241] CPU2006 465.tonto ICE in the vectorizer with -fno-tree-pre

2010-08-10 Thread irar at il dot ibm dot com
--- Comment #5 from irar at il dot ibm dot com 2010-08-10 10:23 --- (In reply to comment #1) This patch should be a valid fix, because the recognition of the dot_prod pattern is known to fail at this point if the stmt is outside the loop. (I am not sure whether we should not see
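For readers unfamiliar with the dot_prod pattern mentioned above, the idiom is a widening multiply of narrow operands feeding a sum reduction in a wider type. The following is a minimal C sketch of that idiom only; the name `dot_prod` and the signature are illustrative assumptions, and it is not the PR's testcase.

```c
/* Dot-product idiom: short operands are multiplied and the products
   are accumulated into an int. */
int dot_prod(const short *a, const short *b, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += (int)a[i] * (int)b[i];  /* widening multiply, then add */
    return sum;
}
```

In this sketch the multiply and the accumulation sit in the same loop. In the testcase quoted in comment #4 the multiply and the sum are in different loops, which is the situation the comment says the pattern recognizer rejects.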

[Bug lto/44152] ICE on compiling xshow.f of xplor-nih with -O3 -ffast-math -fwhopr

2010-07-27 Thread irar at il dot ibm dot com
--- Comment #4 from irar at il dot ibm dot com 2010-07-27 09:25 --- I am testing a patch. -- irar at il dot ibm dot com changed: What|Removed |Added AssignedTo

[Bug tree-optimization/44861] internal compiler error: in vectorizable_load, at tree-vect-stmts.c:3812

2010-07-08 Thread irar at il dot ibm dot com
--- Comment #1 from irar at il dot ibm dot com 2010-07-08 09:14 --- The failure is in vectorizable_store(): /* If accesses through a pointer to vectype do not alias the original memory reference we have a problem. This should never happen

[Bug tree-optimization/44710] New: If-conversion generates redundant statements

2010-06-29 Thread irar at il dot ibm dot com
: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: irar at il dot ibm dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44710

[Bug tree-optimization/44710] If-conversion generates redundant statements

2010-06-29 Thread irar at il dot ibm dot com
--- Comment #1 from irar at il dot ibm dot com 2010-06-29 09:11 --- Created an attachment (id=21036) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21036&action=view) Full testcase -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44710

[Bug tree-optimization/44711] New: PRE doesn't remove equivalent computations of induction variables

2010-06-29 Thread irar at il dot ibm dot com
Version: 4.6.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal Priority: P3 Component: tree-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: irar at il dot ibm dot com http://gcc.gnu.org/bugzilla

[Bug tree-optimization/44711] PRE doesn't remove equivalent computations of induction variables

2010-06-29 Thread irar at il dot ibm dot com
--- Comment #1 from irar at il dot ibm dot com 2010-06-29 11:00 --- Created an attachment (id=21037) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21037&action=view) Full testcase -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44711

[Bug tree-optimization/44507] [4.5/4.6 Regression] vectorization ANDs array elements together incorrectly

2010-06-13 Thread irar at il dot ibm dot com
--- Comment #5 from irar at il dot ibm dot com 2010-06-13 10:29 --- The bug is in creation of a neutral value for BIT_AND_EXPR. What is the correct way to create it for all types? I found double-int.h:#define ALL_ONES (~((unsigned HOST_WIDE_INT) 0)) but it won't work for signed
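To make the "neutral value" question above concrete: the neutral element of bitwise AND is the value with every bit set, so that `acc & x == x` for any `x`. The sketch below shows this at the C source level for one signed and one unsigned type. It is not GCC-internal code, and the function names are made up for the example. For a signed two's-complement type the all-ones value is -1, which is why an unsigned all-ones constant of a fixed width is not directly usable for every type.

```c
#include <stddef.h>
#include <stdint.h>

/* AND-reduce signed 32-bit values. The accumulator starts at the
   neutral element of AND: all bits set, which is -1 for int32_t. */
int32_t and_reduce_i32(const int32_t *a, size_t n)
{
    int32_t acc = -1;
    for (size_t i = 0; i < n; i++)
        acc &= a[i];
    return acc;
}

/* The same reduction for an unsigned 8-bit type, where all bits
   set is 0xFF. */
uint8_t and_reduce_u8(const uint8_t *a, size_t n)
{
    uint8_t acc = UINT8_MAX;
    for (size_t i = 0; i < n; i++)
        acc &= a[i];
    return acc;
}
```

An empty input returns the neutral element itself, which is the property a vectorizer relies on when it pads the initial vector of a reduction.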

[Bug tree-optimization/44507] [4.5/4.6 Regression] vectorization ANDs array elements together incorrectly

2010-06-13 Thread irar at il dot ibm dot com
--- Comment #7 from irar at il dot ibm dot com 2010-06-13 12:01 --- (In reply to comment #6) (In reply to comment #5) The bug is in creation of a neutral value for BIT_AND_EXPR. What is the correct way to create it for all types? I found double-int.h:#define ALL_ONES

[Bug tree-optimization/44183] Vectorizer may generate invalid memory access

2010-05-20 Thread irar at il dot ibm dot com
--- Comment #1 from irar at il dot ibm dot com 2010-05-20 07:13 --- Do you mean that extract_even implementation does something illegal with this last element? Misaligned load also accesses elements outside the array, but the problem is in extract_even? Other than doing something

[Bug tree-optimization/44183] Vectorizer may generate invalid memory access

2010-05-20 Thread irar at il dot ibm dot com
--- Comment #3 from irar at il dot ibm dot com 2010-05-20 10:04 --- I am curious what is the problem with that? These elements are not used, they are just loaded... -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44183

[Bug tree-optimization/44183] Vectorizer may generate invalid memory access

2010-05-20 Thread irar at il dot ibm dot com
--- Comment #5 from irar at il dot ibm dot com 2010-05-20 10:24 --- Even if we are talking about less than vector size from array boundary? And that boundary is not (vector) aligned. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44183

[Bug middle-end/43901] [4.6 Regression] FAIL: gcc.c-torture/compile/pr42196-2.c

2010-05-10 Thread irar at il dot ibm dot com
--- Comment #16 from irar at il dot ibm dot com 2010-05-10 08:17 --- Fixed. -- irar at il dot ibm dot com changed: What|Removed |Added Status|ASSIGNED

[Bug middle-end/43901] [4.6 Regression] FAIL: gcc.c-torture/compile/pr42196-2.c

2010-05-05 Thread irar at il dot ibm dot com
--- Comment #14 from irar at il dot ibm dot com 2010-05-05 09:02 --- It tries to get a _vector_ type of the same size. In theory each vectorization method can choose whatever vector size suits them most (as for external defs they need to build up a vector of equivalent elements

[Bug middle-end/43901] [4.6 Regression] FAIL: gcc.c-torture/compile/pr42196-2.c

2010-05-03 Thread irar at il dot ibm dot com
--- Comment #12 from irar at il dot ibm dot com 2010-05-03 12:30 --- Well. For loops we'd have disqualified it as there is no vector type for the external def (well, the stmt inside the loop). I don't think that's true. With -fno-tree-pre we get the same ICE for loop vectorization

[Bug middle-end/43901] [4.6 Regression] FAIL: gcc.c-torture/compile/pr42196-2.c

2010-05-02 Thread irar at il dot ibm dot com
--- Comment #9 from irar at il dot ibm dot com 2010-05-02 11:08 --- Thanks, Uros! I reproduced the ICE using your instructions. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43901

[Bug middle-end/43901] [4.6 Regression] FAIL: gcc.c-torture/compile/pr42196-2.c

2010-05-02 Thread irar at il dot ibm dot com
--- Comment #10 from irar at il dot ibm dot com 2010-05-02 12:12 --- Looks like it's caused by: r158157 | rguenth | 2010-04-09 13:40:14 +0300 (Fri, 09 Apr 2010) | 28 lines The problem is in getting vectype for f1_2: foo (int b, double f1, double f2, int c1, int c2) { ... float D

[Bug middle-end/43901] [4.6 Regression] FAIL: gcc.c-torture/compile/pr42196-2.c

2010-05-01 Thread irar at il dot ibm dot com
--- Comment #4 from irar at il dot ibm dot com 2010-05-02 05:51 --- I don't have access to ia64. I tried to change the types in the test to make the basic blocks vectorizable on x86_64, but didn't get any error. So I still need SLP dump in order to solve this. Thanks, Ira -- irar

[Bug middle-end/43901] [4.6 Regression] FAIL: gcc.c-torture/compile/pr42196-2.c

2010-04-26 Thread irar at il dot ibm dot com
--- Comment #1 from irar at il dot ibm dot com 2010-04-27 05:53 --- Could you please give some more information? It doesn't fail on x86_64-linux. (For SLP dump please use -fdump-tree-slp-details). Thanks, Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43901

[Bug tree-optimization/43842] [4.6 Regression] ice in vect_create_epilog_for_reduction

2010-04-22 Thread irar at il dot ibm dot com
-- irar at il dot ibm dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com |dot org

[Bug testsuite/43482] Fix *.log tests merged output containing ===

2010-04-22 Thread irar at il dot ibm dot com
--- Comment #6 from irar at il dot ibm dot com 2010-04-22 18:11 --- Yes, sorry about that. I updated the ChangeLogs. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43482

[Bug rtl-optimization/31485] C complex numbers, amd64 SSE, missed optimization opportunity

2010-04-21 Thread irar at il dot ibm dot com
--- Comment #8 from irar at il dot ibm dot com 2010-04-21 11:33 --- Yes, it's possible to add this to SLP. But I don't understand how D.3154_3 = COMPLEX_EXPR D.3163_8, D.3164_9; should be vectorized. D.3154_3 is complex and the rhs will be a vector {D.3163_8, D.3164_9} (btw, we have

[Bug rtl-optimization/31485] C complex numbers, amd64 SSE, missed optimization opportunity

2010-04-21 Thread irar at il dot ibm dot com
--- Comment #10 from irar at il dot ibm dot com 2010-04-21 18:33 --- Thanks. So, it is not always profitable and requires a cost model. I am now working on cost model for basic block vectorization, I can look at this once we have one. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id

[Bug tree-optimization/43771] [4.5/4.6 Regression] ICE on valid when compiling ParMetis with gcc 4.5.0 and -O3

2010-04-19 Thread irar at il dot ibm dot com
--- Comment #7 from irar at il dot ibm dot com 2010-04-19 07:48 --- Fixed on 4.6, 4.5 and 4.4. -- irar at il dot ibm dot com changed: What|Removed |Added

[Bug tree-optimization/37027] SLP loop vectorization missing support for reductions

2010-04-19 Thread irar at il dot ibm dot com
--- Comment #5 from irar at il dot ibm dot com 2010-04-19 14:35 --- Fixed. -- irar at il dot ibm dot com changed: What|Removed |Added Status|NEW

[Bug tree-optimization/43771] [4.5/4.6 Regression] ICE on valid when compiling ParMetis with gcc 4.5.0 and -O3

2010-04-18 Thread irar at il dot ibm dot com
-- irar at il dot ibm dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com |dot org

[Bug tree-optimization/43692] small loop not vectorized

2010-04-08 Thread irar at il dot ibm dot com
--- Comment #1 from irar at il dot ibm dot com 2010-04-08 17:14 --- It probably happens because the vectorization is not profitable. Try the -fno-vect-cost-model flag. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43692

[Bug tree-optimization/43692] small loop not vectorized

2010-04-08 Thread irar at il dot ibm dot com
--- Comment #3 from irar at il dot ibm dot com 2010-04-08 17:33 --- Both loops get vectorized for me with -O3 on x86_64-suse-linux. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43692

[Bug tree-optimization/43692] small loop not vectorized

2010-04-08 Thread irar at il dot ibm dot com
--- Comment #5 from irar at il dot ibm dot com 2010-04-08 17:59 --- In GCC 4.4 the smaller loop gets completely unrolled before the vectorizer. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43692

[Bug tree-optimization/43425] enhance scalar expansion to vectorize this loop

2010-03-28 Thread irar at il dot ibm dot com
--- Comment #2 from irar at il dot ibm dot com 2010-03-28 08:59 --- I think PR 35229 covers this issue. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43425

[Bug tree-optimization/43431] Diagnostic message is not clear for vectorization profitability analysis

2010-03-28 Thread irar at il dot ibm dot com
--- Comment #1 from irar at il dot ibm dot com 2010-03-28 09:41 --- (In reply to comment #0) What does this message mean? vector iteration cost = 2056 is divisible by scalar iteration cost = 4 by a factor greater than or equal to the vectorization factor = 4 . Is the vectorization
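One plausible reading of the quoted diagnostic, offered here as an assumption rather than as the vectorizer's actual code, is that the cost of one vector iteration is at least the cost of the scalar iterations it replaces. In that case vectorizing cannot pay off for any iteration count. A small C sketch of that check, with a hypothetical function name:

```c
/* Returns 1 when one vector iteration costs at least as much as the
   vf scalar iterations it replaces, i.e. when
   vec_iter_cost / scalar_iter_cost >= vf.
   The multiplication form avoids integer-division rounding. */
int vector_iteration_too_costly(int vec_iter_cost, int scalar_iter_cost, int vf)
{
    return vec_iter_cost >= scalar_iter_cost * vf;
}
```

With the numbers from the message (vector cost 2056, scalar cost 4, vectorization factor 4), the ratio is 514, far above 4, so under this reading the loop would be reported as not profitable to vectorize.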

[Bug tree-optimization/43436] Missed vectorization: unhandled data-ref

2010-03-28 Thread irar at il dot ibm dot com
--- Comment #2 from irar at il dot ibm dot com 2010-03-28 10:58 --- (In reply to comment #0) sub_hfyu_median_prediction.c:18: note: not vectorized: unhandled data-ref Looking with GDB at it, I get: (gdb) p debug_data_references (datarefs) (Data Ref: stmt: D.2736_16 = *D

[Bug tree-optimization/43436] Missed vectorization: unhandled data-ref

2010-03-28 Thread irar at il dot ibm dot com
--- Comment #3 from irar at il dot ibm dot com 2010-03-28 11:07 --- (In reply to comment #1) hadamard8_diff.c:44: note: not vectorized: unhandled data-ref There is a function call in this loop as well. hadamard8_diff.c:26: note: not vectorized: data ref analysis failed D.2771_12

[Bug tree-optimization/43543] Reorder the statements in the loop can vectorize it

2010-03-28 Thread irar at il dot ibm dot com
--- Comment #1 from irar at il dot ibm dot com 2010-03-28 11:16 --- Looks similar to PR 32806. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43543

[Bug tree-optimization/43436] Missed vectorization: unhandled data-ref

2010-03-28 Thread irar at il dot ibm dot com
--- Comment #6 from irar at il dot ibm dot com 2010-03-28 18:05 --- (In reply to comment #4) What about fixing the diagnostic message like this: It would be nice to do the same for SLP (compute_data_dependences_for_bb) for completeness. Thanks, Ira diff --git a/gcc/tree-vect-data

[Bug tree-optimization/43436] Missed vectorization: unhandled data-ref

2010-03-28 Thread irar at il dot ibm dot com
--- Comment #7 from irar at il dot ibm dot com 2010-03-28 18:22 --- (In reply to comment #5) When defining the missing function like this: static inline int mid_pred(int a, int b, int c) { int t= (a-b)&((a-b)>>31); a-=t; b+=t; b-= (b-c)&((b-c)>>31); b+= (a-b

[Bug tree-optimization/42652] vectorizer created unaligned vector insns

2010-02-22 Thread irar at il dot ibm dot com
--- Comment #17 from irar at il dot ibm dot com 2010-02-22 09:01 --- Is there a way to pass alignment information similar to PR 39954? Otherwise, a proper fix would be some inter-procedural analysis... Meantime, we can do intra-procedural analysis and fail when we reach function

[Bug tree-optimization/43074] [4.4/4.5 Regression] ICE in vectorizable_reduction, at tree-vect-loop.c:3491

2010-02-15 Thread irar at il dot ibm dot com
-- irar at il dot ibm dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com |dot org

[Bug tree-optimization/42846] GCC sometimes ignores information about pointer target alignment

2010-01-23 Thread irar at il dot ibm dot com
--- Comment #3 from irar at il dot ibm dot com 2010-01-24 07:39 --- This has already been discussed in PR 41464. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42846

[Bug tree-optimization/42652] vectorizer created unaligned vector insns

2010-01-18 Thread irar at il dot ibm dot com
--- Comment #13 from irar at il dot ibm dot com 2010-01-18 12:17 --- Does something like this make sense? (With this patch we will never use peeling for function parameters, unless the builtin returns OK to peel for packed types). Index: tree-vect-data-refs.c

[Bug tree-optimization/42652] vectorizer created unaligned vector insns

2010-01-13 Thread irar at il dot ibm dot com
--- Comment #10 from irar at il dot ibm dot com 2010-01-13 09:35 --- Yes, I understand that we can't assume that an access is aligned if we can't prove it's aligned. I don't understand how we can prove that a COMPONENT_REF is aligned, i.e., if there is a way to check if a struct

[Bug tree-optimization/42709] [4.5 Regression] error: type mismatch in pointer plus expression

2010-01-13 Thread irar at il dot ibm dot com
-- irar at il dot ibm dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com |dot org

[Bug tree-optimization/42652] vectorizer created unaligned vector insns

2010-01-12 Thread irar at il dot ibm dot com
--- Comment #8 from irar at il dot ibm dot com 2010-01-12 08:08 --- So, to be on the safe side, we should assume that COMPONENT_REFs are not naturally aligned and never use peeling for them? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42652

[Bug tree-optimization/42652] vectorizer created unaligned vector insns

2010-01-10 Thread irar at il dot ibm dot com
--- Comment #5 from irar at il dot ibm dot com 2010-01-10 08:22 --- In vector_alignment_reachable_p() we check if an access is packed using contains_packed_reference(). For packed accesses we return false, meaning alignment is unreachable and peeling cannot be used. In the attached
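Why a packed access makes alignment unreachable by peeling can be seen from the member offsets alone. Peeling advances the address by whole elements, so if the first element does not start at a multiple of the element size, no number of peeled iterations lands on a vector-aligned address. The sketch below uses the GCC `packed` attribute; the struct and function names are assumptions for illustration and do not come from the PR's attachment.

```c
#include <stddef.h>
#include <stdint.h>

/* Packed layout: data[] starts right after the 1-byte tag. */
struct __attribute__((packed)) packed_rec {
    char    tag;
    int32_t data[4];
};

/* Natural layout: padding is inserted so that data[] is aligned. */
struct natural_rec {
    char    tag;
    int32_t data[4];
};

size_t packed_data_offset(void)
{
    return offsetof(struct packed_rec, data);
}

size_t natural_data_offset(void)
{
    return offsetof(struct natural_rec, data);
}
```

In the packed struct the array sits at offset 1, so every element address is congruent to 1 modulo 4 relative to the struct base, and stepping by 4 bytes never changes that.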

[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64

2010-01-10 Thread irar at il dot ibm dot com
--- Comment #43 from irar at il dot ibm dot com 2010-01-10 13:43 --- Since -O2 -ftree-vectorize doesn't cause bad code, it has to be some other optimization on top of vectorized code that causes the problem. Bad code is generated when the alignment of 'reduce' is forced

[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64

2010-01-05 Thread irar at il dot ibm dot com
--- Comment #42 from irar at il dot ibm dot com 2010-01-05 09:09 --- So, it's enough to force alignment of reduce only (and to vectorize its loop) to get wrong code. On the other hand, the result of the vectorized loop is correct, and the problem is in choosing the correct index of temp

[Bug middle-end/41956] Segfault in vectorizer

2009-12-30 Thread irar at il dot ibm dot com
--- Comment #7 from irar at il dot ibm dot com 2009-12-30 10:16 --- The bug is in SLP load permutation analysis. I am testing a patch. -- irar at il dot ibm dot com changed: What|Removed |Added

[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64

2009-12-23 Thread irar at il dot ibm dot com
--- Comment #40 from irar at il dot ibm dot com 2009-12-23 14:49 --- (In reply to comment #39) I have regtested the patch in comment #31 and I have ~75 regressions on x86_64-apple-darwin10 in the gcc vect test suite (~100 on powerpc-apple-darwin9). Is this expected? and do you want

[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64

2009-12-22 Thread irar at il dot ibm dot com
--- Comment #30 from irar at il dot ibm dot com 2009-12-22 11:42 --- We can try to verify the alignment issue by applying the two hacks I am attaching. The first one disables alignment forcing for all the data-refs (and marks the alignment as unknown). The loops are still vectorizable

[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64

2009-12-22 Thread irar at il dot ibm dot com
--- Comment #31 from irar at il dot ibm dot com 2009-12-22 11:43 --- Created an attachment (id=19370) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19370&action=view) disable alignment forcing -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41082

[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64

2009-12-22 Thread irar at il dot ibm dot com
--- Comment #32 from irar at il dot ibm dot com 2009-12-22 11:44 --- Created an attachment (id=19371) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19371&action=view) force alignment of vectorized arrays only -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41082

[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64

2009-12-22 Thread irar at il dot ibm dot com
--- Comment #36 from irar at il dot ibm dot com 2009-12-23 07:54 --- Thanks! So, it is alignment of the vectorized arrays. I'd like to do two more checks: 1. Just force alignment of the two arrays (temp and reduce) and do not vectorize. 2. Force alignment of reduce only (and vectorize

[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64

2009-12-22 Thread irar at il dot ibm dot com
--- Comment #37 from irar at il dot ibm dot com 2009-12-23 07:54 --- Created an attachment (id=19377) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19377&action=view) Force alignment but don't vectorize -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41082

[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64

2009-12-22 Thread irar at il dot ibm dot com
--- Comment #38 from irar at il dot ibm dot com 2009-12-23 07:55 --- Created an attachment (id=19378) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19378&action=view) Force alignment of reduce only -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41082

[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64

2009-12-20 Thread irar at il dot ibm dot com
--- Comment #23 from irar at il dot ibm dot com 2009-12-20 12:18 --- The code that now gets vectorized is the summation of array 'reduce': sum(reduce). It looks like the problem is with adding the reduction result to the correct index of 'temp' (scalar code), and not with the reduction

[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64

2009-12-20 Thread irar at il dot ibm dot com
--- Comment #26 from irar at il dot ibm dot com 2009-12-20 13:46 --- I think the problem is in alignment. We force alignment of temp.6 and temp.20 - the arrays of relevant comparison results - even though we don't vectorize their loop. The decision whether we can force alignment is made

[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64

2009-12-20 Thread irar at il dot ibm dot com
--- Comment #28 from irar at il dot ibm dot com 2009-12-20 13:59 --- Hm, I don't know, but this is my best guess - we change something in the code that goes wrong... We also force alignment of reduce, but the reduction computation looks ok. -- http://gcc.gnu.org/bugzilla

[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64

2009-12-16 Thread irar at il dot ibm dot com
--- Comment #21 from irar at il dot ibm dot com 2009-12-16 12:01 --- Thanks. I'll be able to look at this only on Sunday due to holidays. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41082

[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64

2009-12-15 Thread irar at il dot ibm dot com
--- Comment #7 from irar at il dot ibm dot com 2009-12-15 08:25 --- I can't reproduce it with current mainline on powerpc64-suse-linux. Could you please attach the vectorizer dump? Does the good old version get vectorized? If so, could you please attach it as well? Thanks, Ira

[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64

2009-12-15 Thread irar at il dot ibm dot com
--- Comment #11 from irar at il dot ibm dot com 2009-12-15 10:59 --- Looks that it has to be my patch that enables vectorization of conditions: r149806 | irar | 2009-07-20 14:59:10 +0300 (Mon, 20 Jul 2009) | 19 lines * tree-vectorizer.h (vectorizable_condition): Add

[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64

2009-12-15 Thread irar at il dot ibm dot com
--- Comment #13 from irar at il dot ibm dot com 2009-12-15 13:07 --- (In reply to comment #12) Looks that it has to be my patch that enables vectorization of conditions: I am doing a clean bootstrap of C and FORTRAN of revision 149805 to see if the test works for it (allow for ~6h

[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64

2009-12-15 Thread irar at il dot ibm dot com
--- Comment #14 from irar at il dot ibm dot com 2009-12-15 13:08 --- Created an attachment (id=19311) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19311&action=view) powerpc64-suse-linux vect dump -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41082

[Bug middle-end/41082] [4.5 Regression] FAIL: gfortran.fortran-torture/execute/where_2.f90 execution, -O3 -g with -m64

2009-12-15 Thread irar at il dot ibm dot com
--- Comment #16 from irar at il dot ibm dot com 2009-12-15 13:35 --- But in comment #5 you wrote that it passes with the print, right? So, this dump contains correct or incorrect code? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41082

[Bug tree-optimization/42286] October 23rd change to tree-ssa-pre.c breaks calculix on powerpc with -ffast-math

2009-12-06 Thread irar at il dot ibm dot com
--- Comment #3 from irar at il dot ibm dot com 2009-12-06 13:25 --- On powerpc64-suse-linux with current trunk calculix failed after a couple of minutes with -O3 -maltivec -ffast-math -O3 -maltivec -ffast-math -fno-tree-vectorize -O2 -maltivec -ffast-math -O1 -maltivec -ffast-math

[Bug tree-optimization/42108] [4.4/4.5 Regression] Vectorizer cannot deal with PAREN_EXPR gracefully, 50% performance regression

2009-11-30 Thread irar at il dot ibm dot com
--- Comment #20 from irar at il dot ibm dot com 2009-11-30 08:52 --- Actually, PAREN_EXPRs are vectorizable (the support was added by you, Richard, in your original PAREN_EXPR patch http://gcc.gnu.org/viewcvs?limit_changes=0&view=revision&revision=132515 ). The problem here

[Bug tree-optimization/42108] [4.4/4.5 Regression] Vectorizer cannot deal with PAREN_EXPR gracefully, 50% performance regression

2009-11-30 Thread irar at il dot ibm dot com
--- Comment #21 from irar at il dot ibm dot com 2009-11-30 08:54 --- Created an attachment (id=19183) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19183&action=view) Multiple types support patch -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108

[Bug tree-optimization/42108] [4.4/4.5 Regression] Vectorizer cannot deal with PAREN_EXPR gracefully, 50% performance regression

2009-11-30 Thread irar at il dot ibm dot com
--- Comment #23 from irar at il dot ibm dot com 2009-11-30 12:20 --- Applied: http://gcc.gnu.org/viewcvs?limit_changes=0&view=revision&revision=154794 Thanks, Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108

[Bug middle-end/42193] [4.5 Regression] 454.calculix in SPEC CPU 2006 failed to compile at -O3

2009-11-29 Thread irar at il dot ibm dot com
-- irar at il dot ibm dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com |dot org

[Bug tree-optimization/42108] [4.4/4.5 Regression] Vectorizer cannot deal with PAREN_EXPR gracefully, 50% performance regression

2009-11-23 Thread irar at il dot ibm dot com
--- Comment #18 from irar at il dot ibm dot com 2009-11-23 09:02 --- I tried to vectorize eval.f90 with 4.3 and mainline on x86_64-suse-linux. In both cases no loop gets vectorized in subroutine eval. The k loop is not vectorizable because the step of x is unknown (function argument

[Bug tree-optimization/41879] [4.5 Regression] 172.mgrid regression, vectorizer prevents predictive commoning

2009-11-11 Thread irar at il dot ibm dot com
--- Comment #5 from irar at il dot ibm dot com 2009-11-12 07:51 --- (In reply to comment #4) I didn't check yet. We'll work on a simple cost-model integration of predcom. You mean, vectorizer cost model will take predcom into account? If the vectorization is not profitable (vs

[Bug tree-optimization/41879] [4.5 Regression] 172.mgrid regression, vectorizer prevents predictive commoning

2009-11-10 Thread irar at il dot ibm dot com
--- Comment #3 from irar at il dot ibm dot com 2009-11-10 10:02 --- (In reply to comment #0) This causes mgrid score to drop by almost 40% on x86_64 and the vectorized code is pretty bad because it uses unaligned accesses. Is the vectorized code worse than the scalar one even

[Bug tree-optimization/41464] vector loads are unnecessarily split into high and low loads

2009-09-27 Thread irar at il dot ibm dot com
--- Comment #4 from irar at il dot ibm dot com 2009-09-27 08:06 --- (In reply to comment #1) The interesting thing is that data-ref analysis sees 128bit alignment but the vectorizer still produces vect_var_.24_59 = M*vect_p.20_57{misalignment: 0}; D.2564_12 = *D.2563_11

[Bug tree-optimization/41464] vector loads are unnecessarily split into high and low loads

2009-09-27 Thread irar at il dot ibm dot com
--- Comment #6 from irar at il dot ibm dot com 2009-09-27 09:56 --- (In reply to comment #5) aligned to refers to the offset misalignment and not to the misalignment of base. Hmm, I believe it refers to base + offset + constant offset. tree-data-refs.h: /* Alignment

[Bug target/41288] [4.5 Regression] gcc.target/x86_64/abi/test_struct_returning.c regressions on *-apple-darwin* at -m64

2009-09-07 Thread irar at il dot ibm dot com
--- Comment #9 from irar at il dot ibm dot com 2009-09-08 05:51 --- Looks related to PR 39907. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41288

[Bug tree-optimization/41019] Variate_generator with mt19937 and normal_distribution produces wrong sequence for -O3.

2009-08-13 Thread irar at il dot ibm dot com
--- Comment #10 from irar at il dot ibm dot com 2009-08-13 11:34 --- Reduced testcase: #include <stdlib.h> #include <stdio.h> #define N 4 long int a[N]; int main () { int k; for (k = 0; k < N; ++k) a[k] = a[k] != 5 ? 12 : 10; for (k = 0; k < N; ++k) printf ("%u ", a[k

[Bug tree-optimization/41019] Variate_generator with mt19937 and normal_distribution produces wrong sequence for -O3.

2009-08-13 Thread irar at il dot ibm dot com
--- Comment #11 from irar at il dot ibm dot com 2009-08-13 11:36 --- Created an attachment (id=18350) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18350&action=view) The assembly for the long int version (wrong code) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41019

[Bug tree-optimization/41019] Variate_generator with mt19937 and normal_distribution produces wrong sequence for -O3.

2009-08-13 Thread irar at il dot ibm dot com
--- Comment #12 from irar at il dot ibm dot com 2009-08-13 11:37 --- Created an attachment (id=18351) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18351&action=view) The assembly for the int version (correct) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41019

[Bug tree-optimization/41019] Variate_generator with mt19937 and normal_distribution produces wrong sequence for -O3.

2009-08-12 Thread irar at il dot ibm dot com
--- Comment #6 from irar at il dot ibm dot com 2009-08-12 12:14 --- Looks like a problem in data-ref analysis: Creating dr for this_6(D)->_M_x[__k_87] ... base_address: this_6(D) offset from base address: 0 constant offset from base address: 0 step: 8

[Bug tree-optimization/41008] [4.5 Regression] ICE in vect_is_simple_reduction, at tree-vect-loop.c:1708

2009-08-09 Thread irar at il dot ibm dot com
--- Comment #3 from irar at il dot ibm dot com 2009-08-09 12:15 --- Fixed. -- irar at il dot ibm dot com changed: What|Removed |Added Status|UNCONFIRMED

[Bug middle-end/37150] vectorizer misses some loops

2009-08-06 Thread irar at il dot ibm dot com
--- Comment #10 from irar at il dot ibm dot com 2009-08-06 10:49 --- Yes. The problem is that only a basic implementation was added. To vectorize this code several improvements must be done: support stmt group sizes greater than vector size, allow loads and stores to the same location

[Bug fortran/31067] MINLOC should sometimes be inlined (gas_dyn is sooooo sloooow)

2009-07-28 Thread irar at il dot ibm dot com
--- Comment #41 from irar at il dot ibm dot com 2009-07-28 08:12 --- That requires pattern recognition. MIN/MAX_EXPR are recognized by the first phiopt pass, so MIN/MAXLOC should be either also recognized there or in the vectorizer. (The phiopt pass transforms if clause to MIN/MAX_EXPR

[Bug fortran/31067] MINLOC should sometimes be inlined (gas_dyn is sooooo sloooow)

2009-07-27 Thread irar at il dot ibm dot com
--- Comment #34 from irar at il dot ibm dot com 2009-07-27 08:36 --- (In reply to comment #33) Using the example from comment 23 with ... gfortran shows: test.f90:12: note: not vectorized: unsupported use in stmt. and needs 2.272s. (By comparison. 4.4 needs 3.688s

[Bug fortran/31067] MINLOC should sometimes be inlined (gas_dyn is sooooo sloooow)

2009-07-27 Thread irar at il dot ibm dot com
--- Comment #38 from irar at il dot ibm dot com 2009-07-27 12:44 --- I am not sure that that kind of computation can be generated automatically, since in general the order of calculation of cond_expr cannot be changed. However, the loop can be split: for (i = 0; i < end; i

[Bug tree-optimization/40801] internal compiler error: in vect_get_vec_def_for_stmt_copy, at tree-vect-stmts.c:1096

2009-07-26 Thread irar at il dot ibm dot com
--- Comment #5 from irar at il dot ibm dot com 2009-07-26 07:04 --- Fixed. -- irar at il dot ibm dot com changed: What|Removed |Added Status|ASSIGNED

[Bug fortran/31067] MINLOC should sometimes be inlined (gas_dyn is sooooo sloooow)

2009-07-26 Thread irar at il dot ibm dot com
--- Comment #32 from irar at il dot ibm dot com 2009-07-26 07:48 --- (In reply to comment #30) Regarding the just committed inline version: It would be interesting to know whether it is vectorizable (with/without -ffinite-math-only [i.e. -ffast-math]). It depends on where

[Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing

2009-07-20 Thread irar at il dot ibm dot com
--- Comment #7 from irar at il dot ibm dot com 2009-07-20 11:18 --- AFAIU, querying for the component type of complex type is not difficult to implement. I think, that loop-based vectorization is preferable here, so we should stay with vectorization factor of 2 for doubles. The next

[Bug fortran/31067] MINLOC should sometimes be inlined (gas_dyn is sooooo sloooow)

2009-07-20 Thread irar at il dot ibm dot com
--- Comment #28 from irar at il dot ibm dot com 2009-07-20 12:03 --- I've just committed a patch that adds support of cond_expr in reductions in nested cycles (http://gcc.gnu.org/ml/gcc-patches/2009-07/msg01124.html). cond_expr cannot be vectorized in reduction of inner-most loop

[Bug tree-optimization/40801] internal compiler error: in vect_get_vec_def_for_stmt_copy, at tree-vect-stmts.c:1096

2009-07-19 Thread irar at il dot ibm dot com
--- Comment #3 from irar at il dot ibm dot com 2009-07-19 09:35 --- Testing a fix. Ira -- irar at il dot ibm dot com changed: What|Removed |Added AssignedTo

[Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing

2009-07-16 Thread irar at il dot ibm dot com
--- Comment #2 from irar at il dot ibm dot com 2009-07-16 12:29 --- pr40770.c:20: note: == examining statement: sincostmp.21_1 = __builtin_cexpi (D.1625_3); pr40770.c:20: note: get vectype for scalar type: complex double pr40770.c:20: note: not vectorized: unsupported data-type complex

[Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing

2009-07-16 Thread irar at il dot ibm dot com
--- Comment #6 from irar at il dot ibm dot com 2009-07-16 17:31 --- (In reply to comment #3) make_vector_type returns NULL for this type. Yes - there is no vector type for complex double. But the vectorizer could query for a vector type for the complex component type (double
