[Bug tree-optimization/37416] [4.4 Regression] Failure to return number of loop iterations
--- Comment #2 from irar at il dot ibm dot com 2008-11-22 15:08 ---
(In reply to comment #1)
> This bug is shamefully incomplete. There is no way anyone willing to give this
> a look can know what to look for.
> For example, a few things one would have to know before he/she can even begin
> to consider whether/how to analyze the problem:
> 1. What is the target where you see this?
> 2. What compiler flags are you using?

-O3

> 3. Where do you look for the number of iterations (which dump)?

vectorizer's dump

> 4. What "missed-optimization" does this cause (something not vectorized)?

the loop is not vectorized because the number of iterations is unknown

> Please read http://gcc.gnu.org/bugs.html#report before filing more bugs.

--
irar at il dot ibm dot com changed:
What|Removed|Added
GCC build triplet||x86_64-suse-linux
GCC host triplet||x86_64-suse-linux
GCC target triplet||x86_64-suse-linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37416
[Bug tree-optimization/38464] [4.4 Regression] vect/costmodel/ppc/costmodel-slp-12.c fails to vectorize
--- Comment #2 from irar at il dot ibm dot com 2008-12-11 08:02 --- Fixed. -- irar at il dot ibm dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38464
[Bug tree-optimization/38529] [4.3/4.4 regression] ICE with nested loops
-- irar at il dot ibm dot com changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever Confirmed|0 |1 Last reconfirmed|-00-00 00:00:00 |2008-12-15 08:26:30 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38529
[Bug tree-optimization/38529] [4.3/4.4 regression] ICE with nested loops
-- irar at il dot ibm dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com |dot org | Status|NEW |ASSIGNED Last reconfirmed|2008-12-15 08:26:30 |2008-12-15 14:42:26 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38529
[Bug tree-optimization/37194] [4.3/4.4 Regression] Autovectorization of small constant iteration loop degrades performance
--- Comment #7 from irar at il dot ibm dot com 2008-12-30 14:57 ---
(In reply to comment #6)
> t.i:3: note: Vectorization may not be profitable.
> why doesn't the cost model then disallow vectorization here?

This message is misleading. It only means that there exists a loop bound threshold, either defined by the user or calculated by the cost model. It does not mean that the cost model has decided that the vectorization is not profitable. I am adding this to our cleanup todo list.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37194
[Bug tree-optimization/37194] [4.3/4.4 Regression] Autovectorization of small constant iteration loop degrades performance
--- Comment #8 from irar at il dot ibm dot com 2009-01-05 13:58 --- To handle unknown alignment of data, the vectorizer creates a prolog loop to peel a statically unknown number of scalar iterations (0<=nhttp://gcc.gnu.org/bugzilla/show_bug.cgi?id=37194
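Editorial illustration of the prolog peeling mentioned in comment #8: a minimal C sketch, not GCC's generated code. The helper names `prolog_iterations` and `add_one`, the 16-byte vector size, and the 4-element inner loop standing in for a vector operation are all assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of scalar iterations to peel so that the
   access at `p` becomes `vec_bytes`-aligned.  Assumes `p` is aligned to
   `elem_bytes` and that `vec_bytes` is a multiple of `elem_bytes`.  The
   result is only known at run time, which is why a prolog loop is needed.  */
static size_t prolog_iterations(const void *p, size_t elem_bytes,
                                size_t vec_bytes)
{
    size_t misalign = (size_t)((uintptr_t)p % vec_bytes);
    if (misalign == 0)
        return 0;
    return (vec_bytes - misalign) / elem_bytes;
}

/* Adds 1.0f to every element: scalar prolog until `a + i` is aligned,
   then a main loop in chunks of 4 floats (standing in for one 16-byte
   vector operation), then a scalar epilog for the remainder.  */
static void add_one(float *a, size_t n)
{
    size_t i = 0;
    size_t peel = prolog_iterations(a, sizeof(float), 16);
    if (peel > n)
        peel = n;

    for (; i < peel; i++)            /* prolog: peeled scalar iterations */
        a[i] += 1.0f;

    for (; i + 4 <= n; i += 4)       /* main loop: aligned chunks */
        for (size_t j = 0; j < 4; j++)
            a[i + j] += 1.0f;

    for (; i < n; i++)               /* epilog: leftover iterations */
        a[i] += 1.0f;
}
```

For a tiny constant trip count, the prolog and epilog can account for most or all of the iterations, which is consistent with the slowdown discussed in this PR.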
[Bug tree-optimization/38721] [alias-improvements] vectorizer miscompiles gfortran.fortran-torture/execute/elemental.f90 at -O3
--- Comment #1 from irar at il dot ibm dot com 2009-01-05 13:19 ---
Here is a reduced testcase:

program test_elemental
  implicit none
  integer, dimension (2, 4) :: a
  integer, dimension (2, 4) :: b
  integer(kind = 8), dimension(2) :: c

  a = reshape ((/2, 3, 4, 5, 6, 7, 8, 9/), (/2, 4/))
  b = 0
  a = e_fn (a(:, 4:1:-1), 1 + b)
  ! This tests intrinsic elemental conversion functions.
  c = 2 * a(1, 1)
  if (any (c .ne. 14)) call abort

  ! This triggered bug due to building ss chains in the wrong order.
  b = 0;
  a = a - e_fn (a, b)
  if (any (a .ne. 0)) call abort
contains
  elemental integer(kind=4) function e_fn (p, q)
    integer, intent(in) :: p, q
    e_fn = p - q
  end function
end program

The problem is that dse2 removes the stores to array A.4 which is used by the vectorized code:

  A.4[0] = D.1635_155;
  ...
  A.4[7] = D.1635_165;
  vect_pA.67_156 = (vector integer(kind=4) *) &A.4;
  vect_pa.73_197 = (vector integer(kind=4) *) &a;
  vect_var_.68_254 = *vect_pA.67_156;
  *vect_pa.73_197 = vect_var_.68_254;
  vect_pA.63_256 = vect_pA.67_156 + 16;
  vect_pa.69_257 = vect_pa.73_197 + 16;
  vect_var_.68_170 = *vect_pA.63_256;
  *vect_pa.69_257 = vect_var_.68_170;

We propagate alias info from the scalar to vector ref in vect_create_data_ref_ptr() (in tree-vect-transform.c):

  /** (2) Add aliasing information to the new vector-pointer:
          (The points-to info (DR_PTR_INFO) may be defined later.) **/

  tag = DR_SYMBOL_TAG (dr);
  gcc_assert (tag);

  /* If tag is a variable (and NOT_A_TAG) than a new symbol memory
     tag must be created with tag added to its may alias list. */
  if (!MTAG_P (tag))
    new_type_alias (vect_ptr, tag, DR_REF (dr));
  else
    set_symbol_mem_tag (vect_ptr, tag);

Those lines do not exist on the branch. Do you take care of this somewhere else?

Ira

--
irar at il dot ibm dot com changed:
What|Removed|Added
Status|UNCONFIRMED|NEW
Ever Confirmed|0|1
Last reconfirmed date|-00-00 00:00:00|2009-01-05 13:19:53
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38721
[Bug tree-optimization/37194] [4.3/4.4 Regression] Autovectorization of small constant iteration loop degrades performance
--- Comment #12 from irar at il dot ibm dot com 2009-01-08 09:25 ---
(In reply to comment #11)
> fixed for 4.3.3?
> Thanks.

No, still waiting for approval.

--
irar at il dot ibm dot com changed:
What|Removed|Added
Status|RESOLVED|REOPENED
Resolution|FIXED|
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37194
[Bug tree-optimization/38529] [4.3 regression] ICE with nested loops
--- Comment #4 from irar at il dot ibm dot com 2009-01-11 07:48 --- Fixed on 4.3 branch as well. -- irar at il dot ibm dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38529
[Bug tree-optimization/37194] [4.3 Regression] Autovectorization of small constant iteration loop degrades performance
--- Comment #14 from irar at il dot ibm dot com 2009-01-11 07:57 --- Fixed. -- irar at il dot ibm dot com changed: What|Removed |Added Status|REOPENED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37194
[Bug tree-optimization/37021] Fortran Complex reduction / multiplication not vectorized
--- Comment #6 from irar at il dot ibm dot com 2009-01-25 09:12 ---
(In reply to comment #5)
> So,
> 4) The vectorized version sucks because we have to use peeling for niters
> because we need to unroll the loop once and cannot apply SLP here.

What do you mean by "unroll the loop once"?

> Q1: does SLP work with reductions at all?

No. SLP currently originates from groups of strided stores.

> Q2: does SLP do pattern recognition?

Pattern recognition is done before SLP, and SLP handles stmts that were marked as a part of a pattern. There is no SLP-specific pattern recognition.

> First of all we would need to recognize a complex reduction as a single
> vectorized reduction. Second we need to vectorize the complex multiplication
> with SLP, feeding the reduction with one resulting complex vector.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37021
[Bug tree-optimization/37021] Fortran Complex reduction / multiplication not vectorized
--- Comment #8 from irar at il dot ibm dot com 2009-01-25 12:17 ---
(In reply to comment #7)
> > > Q1: does SLP work with reductions at all?
> >
> > No. SLP currently originates from groups of strided stores.
> Ah, I see. In this loop we have two reductions, so to apply SLP
> we would need to see that we can use a group of reductions for SLP?

Yes, I think this will work.

> > > Q2: does SLP do pattern recognition?
> >
> > Pattern recognition is done before SLP, and SLP handles stmts that were
> > marked as a part of a pattern. There is no SLP-specific pattern recognition.
> Ok, but with a reduction it won't help me here.
> Can a loop be vectorized with just pattern recognition? Hm, if I
> remember correctly we detect scalar patterns and then vectorize them.
> We don't support detecting "vector patterns" from scalar code, correct?

Yes, if I understand you correctly, we detect scalar patterns, but adding vector pattern detection does not seem to be complicated.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37021
[Bug tree-optimization/38968] Complex matrix product is not vectorized
--- Comment #3 from irar at il dot ibm dot com 2009-01-26 13:09 ---
(In reply to comment #2)
> Now, I wonder why we do not just use alignment + misalign in that case.

I think you are right.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38968
[Bug middle-end/40021] [4.5 Regression] Revision 146817 miscompiled DAXPY in BLAS
--- Comment #6 from irar at il dot ibm dot com 2009-05-05 12:41 --- Reproduced on x86_64-suse-linux. Seems that, somehow, the vectorized version of loop in line 29 is performed, even though the number of scalar iterations is 1. -- irar at il dot ibm dot com changed: What|Removed |Added CC||irar at il dot ibm dot com Status|UNCONFIRMED |NEW Ever Confirmed|0 |1 Last reconfirmed|-00-00 00:00:00 |2009-05-05 12:41:15 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40021
[Bug tree-optimization/40074] [4.4/4.5 Regression] ICE in vect_get_vec_def_for_operand, at tree-vect-stmts.c:944
--- Comment #13 from irar at il dot ibm dot com 2009-05-10 09:20 ---
(In reply to comment #12)
> Well, that revision only enabled vectorization support for more things...
> (which is probably what makes this a regression in the first place).

Right, I think it is something in the strided accesses detection. I am looking into it now.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40074
[Bug tree-optimization/40074] [4.4/4.5 Regression] ICE in vect_get_vec_def_for_operand, at tree-vect-stmts.c:944
--- Comment #14 from irar at il dot ibm dot com 2009-05-10 11:00 ---
I am testing:

Index: tree-vect-data-refs.c
===
--- tree-vect-data-refs.c (revision 147329)
+++ tree-vect-data-refs.c (working copy)
@@ -1424,7 +1424,7 @@ vect_analyze_group_access (struct data_r
       /* First stmt in the interleaving chain. Check the chain. */
       gimple next = DR_GROUP_NEXT_DR (vinfo_for_stmt (stmt));
       struct data_reference *data_ref = dr;
-      unsigned int count = 1;
+      unsigned int count = 1, gaps = 0;
       tree next_step;
       tree prev_init = DR_INIT (data_ref);
       gimple prev = stmt;
@@ -1490,6 +1490,8 @@ vect_analyze_group_access (struct data_r
                     fprintf (vect_dump, "interleaved store with gaps");
                   return false;
                 }
+
+              gaps += diff - 1;
             }

           /* Store the gap from the previous member of the group. If there is no
@@ -1506,8 +1508,9 @@ vect_analyze_group_access (struct data_r
          the type to get COUNT_IN_BYTES. */
       count_in_bytes = type_size * count;

-      /* Check that the size of the interleaving is not greater than STEP. */
-      if (dr_step < count_in_bytes)
+      /* Check that the size of the interleaving (including gaps) is not greater
+         than STEP. */
+      if (dr_step && dr_step < count_in_bytes + gaps * type_size)
         {
           if (vect_print_dump_info (REPORT_DETAILS))
             {

It fixes the reduced testcase, but I failed to compile the original one, so maybe someone could check that the above patch fixes the ICE for the original testcase?

Thanks,
Ira

--
irar at il dot ibm dot com changed:
What|Removed|Added
AssignedTo|unassigned at gcc dot gnu dot org|irar at il dot ibm dot com
Status|NEW|ASSIGNED
Last reconfirmed date|2009-05-08 20:59:57|2009-05-10 11:00:34
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40074
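Editorial illustration of the check changed by the patch in comment #14: a stand-alone C sketch, assuming the variables mean what their names in the patch suggest (sizes in bytes, `gaps` counted in elements). The function name `group_exceeds_step` is hypothetical, and the numbers in the usage note are made up; they are not taken from the PR's testcase.

```c
#include <stdbool.h>

/* Returns true when an interleaving group does not fit into one step.
   The group consists of `count` accessed elements plus `gaps` skipped
   elements between them, each `type_size` bytes wide.  As in the patch,
   a zero `dr_step` is never reported as too small.  */
static bool group_exceeds_step(unsigned dr_step, unsigned count,
                               unsigned gaps, unsigned type_size)
{
    unsigned count_in_bytes = type_size * count;
    return dr_step && dr_step < count_in_bytes + gaps * type_size;
}
```

With 4-byte elements, two accesses separated by one skipped element span 12 bytes, so a step of 8 is rejected (`group_exceeds_step(8, 2, 1, 4)` is true) even though the two accesses alone occupy only 8 bytes; that is the case the pre-patch comparison `dr_step < count_in_bytes` let through.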
[Bug tree-optimization/40074] [4.4/4.5 Regression] ICE in vect_get_vec_def_for_operand, at tree-vect-stmts.c:944
--- Comment #18 from irar at il dot ibm dot com 2009-05-11 12:45 --- Fixed. -- irar at il dot ibm dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40074
[Bug tree-optimization/40233] New: Test failures with "alignment of array elements is greater than element size"
for excess errors)
WARNING: g++.dg/torture/stackalign/eh-alloca-1.C -O3 -g compilation failed to produce executable
FAIL: g++.dg/torture/stackalign/eh-global-1.C -O3 -fomit-frame-pointer (test for excess errors)
WARNING: g++.dg/torture/stackalign/eh-global-1.C -O3 -fomit-frame-pointer compilation failed to produce executable
FAIL: g++.dg/torture/stackalign/eh-global-1.C -O3 -g (test for excess errors)
WARNING: g++.dg/torture/stackalign/eh-global-1.C -O3 -g compilation failed to produce executable
FAIL: g++.dg/torture/stackalign/eh-inline-1.C -O3 -fomit-frame-pointer (test for excess errors)
WARNING: g++.dg/torture/stackalign/eh-inline-1.C -O3 -fomit-frame-pointer compilation failed to produce executable
FAIL: g++.dg/torture/stackalign/eh-inline-1.C -O3 -g (test for excess errors)
WARNING: g++.dg/torture/stackalign/eh-inline-1.C -O3 -g compilation failed to produce executable
FAIL: g++.dg/torture/stackalign/eh-inline-2.C -O3 -fomit-frame-pointer (test for excess errors)
WARNING: g++.dg/torture/stackalign/eh-inline-2.C -O3 -fomit-frame-pointer compilation failed to produce executable
FAIL: g++.dg/torture/stackalign/eh-inline-2.C -O3 -g (test for excess errors)
WARNING: g++.dg/torture/stackalign/eh-inline-2.C -O3 -g compilation failed to produce executable
FAIL: g++.dg/torture/stackalign/eh-vararg-1.C -O3 -fomit-frame-pointer (test for excess errors)
WARNING: g++.dg/torture/stackalign/eh-vararg-1.C -O3 -fomit-frame-pointer compilation failed to produce executable
FAIL: g++.dg/torture/stackalign/eh-vararg-1.C -O3 -g (test for excess errors)
WARNING: g++.dg/torture/stackalign/eh-vararg-1.C -O3 -g compilation failed to produce executable
FAIL: g++.dg/torture/stackalign/eh-vararg-2.C -O3 -fomit-frame-pointer (test for excess errors)
WARNING: g++.dg/torture/stackalign/eh-vararg-2.C -O3 -fomit-frame-pointer compilation failed to produce executable
FAIL: g++.dg/torture/stackalign/eh-vararg-2.C -O3 -g (test for excess errors)
WARNING: g++.dg/torture/stackalign/eh-vararg-2.C -O3 -g compilation failed to produce executable

The failures start from revision 147829 - basic block SLP. SLP checks if there is a vector type for the scalar type used in a basic block. It calls make_vector_type() for a vector type, where array of this type is built for debug representation purposes in build_array_type():

    at ../../gcc/gcc/stor-layout.c:1848
1848 error ("alignment of array elements is greater than element size");
(gdb) back
#0 layout_type (type=0x2b2860eb2240) at ../../gcc/gcc/stor-layout.c:1848
#1 0x008dc33c in type_hash_lookup (hashcode=2524125531, type=0x40) at ../../gcc/gcc/tree.c:4721
#2 0x008dc3c9 in type_hash_canon (hashcode=2524125531, type=0x40) at ../../gcc/gcc/tree.c:4772
#3 0x008dd1d1 in build_array_type (elt_type=0x2b2860e52600, index_type=0x2b2860dd90c0) at ../../gcc/gcc/tree.c:5851
#4 0x008f4d1d in make_vector_type (innertype=0x2b2860e52600, nunits=4, mode=VOIDmode) at ../../gcc/gcc/tree.c:7441
#5 0x0089d9c8 in get_vectype_for_scalar_type (scalar_type=0x2b2860e52600) at ../../gcc/gcc/tree-vect-stmts.c:4348
#6 0x00bbc3ef in vect_analyze_data_refs (loop_vinfo=, bb_vinfo=) at ../../gcc/gcc/tree-vect-data-refs.c:2050
...
(gdb) p debug_generic_expr (type)
aligned[4]
$6 = void

--
Summary: Test failures with "alignment of array elements is greater than element size"
Product: gcc
Version: 4.5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: irar at il dot ibm dot com
GCC build triplet: x86_64-suse-linux
GCC host triplet: x86_64-suse-linux
GCC target triplet: x86_64-suse-linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40233
[Bug middle-end/40240] [4.5 regression] ICE in execute_cse_reciprocals, at tree-ssa-math-opts.c:469
--- Comment #2 from irar at il dot ibm dot com 2009-05-25 08:20 ---
(In reply to comment #1)
> this is likely being fixed by Ira

I committed the fix. Could you please check if it really fixes this one as well?

Thanks,
Ira

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40240
[Bug middle-end/40244] [4.5 Regression] Revision147829 caused extra failures
--- Comment #1 from irar at il dot ibm dot com 2009-05-26 08:58 ---
(In reply to comment #0)
> On Linux/ia64, revision 147829:
> http://gcc.gnu.org/ml/gcc-cvs/2009-05/msg00806.html
> caused:
> FAIL: Matrix4f -O3 compilation from source

Could you please provide some information? It doesn't fail on x86_64...

> FAIL: gcc.dg/vect/bb-slp-10.c scan-tree-dump-times slp "unsupported alignment
> in basic block." 1
> FAIL: gcc.dg/vect/bb-slp-4.c scan-tree-dump-times slp "basic block vectorized
> using SLP" 0

I think they can be fixed as follows. Could you please check?

Index: testsuite/gcc.dg/vect/bb-slp-4.c
===
--- testsuite/gcc.dg/vect/bb-slp-4.c (revision 147862)
+++ testsuite/gcc.dg/vect/bb-slp-4.c (working copy)
@@ -18,14 +18,10 @@ main1 ()

   *pout++ = *pin++;
   *pout++ = *pin++;
-  *pout++ = *pin++;
-  *pout++ = *pin++;

   /* Check results. */
   if (out[0] != in[0]
-      || out[1] != in[1]
-      || out[2] != in[2]
-      || out[3] != in[3])
+      || out[1] != in[1])
     abort();

   return 0;
Index: testsuite/gcc.dg/vect/bb-slp-10.c
===
--- testsuite/gcc.dg/vect/bb-slp-10.c (revision 147862)
+++ testsuite/gcc.dg/vect/bb-slp-10.c (working copy)
@@ -14,7 +14,7 @@ main1 (unsigned int x, unsigned int y)
 {
   int i;
   unsigned int *pin = &in[0];
-  unsigned int *pout = &out[2];
+  unsigned int *pout = &out[1];
   unsigned int a0, a1, a2, a3;

   /* Misaligned store. */
@@ -29,10 +29,10 @@ main1 (unsigned int x, unsigned int y)
   *pout++ = a3 * y;

   /* Check results. */
-  if (out[2] != (in[0] + 23) * x
-      || out[3] != (in[1] + 142) * y
-      || out[4] != (in[2] + 2) * x
-      || out[5] != (in[3] + 31) * y)
+  if (out[1] != (in[0] + 23) * x
+      || out[2] != (in[1] + 142) * y
+      || out[3] != (in[2] + 2) * x
+      || out[4] != (in[3] + 31) * y)
     abort();

   return 0;

Thanks,
Ira

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40244
[Bug tree-optimization/40254] [4.5 Regression] SPEC2006 403.gcc miscompares
--- Comment #4 from irar at il dot ibm dot com 2009-05-27 08:43 --- The bug is in data-refs analysis for basic blocks: two accesses that are not adjacent (reload.c:1370) are considered as adjacent, and, therefore, get vectorized together, causing the wrong code generation. -- irar at il dot ibm dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com |dot org | Status|UNCONFIRMED |ASSIGNED Ever Confirmed|0 |1 Last reconfirmed|-00-00 00:00:00 |2009-05-27 08:43:46 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40254
[Bug tree-optimization/40254] [4.5 Regression] SPEC2006 403.gcc miscompares
--- Comment #5 from irar at il dot ibm dot com 2009-05-27 09:59 ---
I'll test this patch tomorrow:

Index: tree-data-ref.c
===
--- tree-data-ref.c (revision 147903)
+++ tree-data-ref.c (working copy)
@@ -718,17 +725,26 @@ dr_analyze_innermost (struct data_refere
       base_iv.no_overflow = true;
     }

-  if (!poffset || !in_loop)
+  if (!poffset)
     {
       offset_iv.base = ssize_int (0);
       offset_iv.step = ssize_int (0);
     }
-  else if (!simple_iv (loop, loop_containing_stmt (stmt),
-                       poffset, &offset_iv, false))
+  else
     {
-      if (dump_file && (dump_flags & TDF_DETAILS))
-        fprintf (dump_file, "failed: evolution of offset is not affine.\n");
-      return false;
+      if (!in_loop)
+        {
+          offset_iv.base = poffset;
+          offset_iv.step = ssize_int (0);
+        }
+      else if (!simple_iv (loop, loop_containing_stmt (stmt),
+                           poffset, &offset_iv, false))
+        {
+          if (dump_file && (dump_flags & TDF_DETAILS))
+            fprintf (dump_file, "failed: evolution of offset is not"
+                     " affine.\n");
+          return false;
+        }
     }

   init = ssize_int (pbitpos / BITS_PER_UNIT);

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40254
[Bug middle-end/40244] [4.5 Regression] Revision 147829 caused extra failures
--- Comment #5 from irar at il dot ibm dot com 2009-05-30 16:53 ---
(In reply to comment #4)
> (In reply to comment #1)
> > (In reply to comment #0)
> > > On Linux/ia64, revision 147829:
> > > http://gcc.gnu.org/ml/gcc-cvs/2009-05/msg00806.html
> > > caused:
> > > FAIL: Matrix4f -O3 compilation from source
> >
> > Could you please provide some information, it doesn't fail on x86_64...
> >
> > > FAIL: gcc.dg/vect/bb-slp-10.c scan-tree-dump-times slp "unsupported
> > > alignment in basic block." 1
> > > FAIL: gcc.dg/vect/bb-slp-4.c scan-tree-dump-times slp "basic block
> > > vectorized using SLP" 0
> >
> > I think they can be fixed as following. Could you please check?
> >
> Yes, it fixed the problem. Thanks.

Thanks. Is Matrix4f OK now too?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40244
[Bug testsuite/40244] [4.5 Regression] Revision 147829 caused extra failures
-- irar at il dot ibm dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com |dot org | Status|NEW |ASSIGNED Component|middle-end |testsuite Last reconfirmed|2009-05-29 07:52:46 |2009-05-31 06:45:04 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40244
[Bug testsuite/40244] [4.5 Regression] Revision 147829 caused extra failures
--- Comment #8 from irar at il dot ibm dot com 2009-05-31 09:04 --- Fixed. -- irar at il dot ibm dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40244
[Bug tree-optimization/39129] The meaning of 'BB' in "too many BBs in loop"
--- Comment #4 from irar at il dot ibm dot com 2009-05-31 10:55 ---
So, will "too many basic blocks in loop" be good enough? Because this is what it is: the reason that the loop form is not suitable for the vectorizer is that there are too many basic blocks in it.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39129
[Bug tree-optimization/39129] The meaning of 'BB' in "too many BBs in loop"
--- Comment #6 from irar at il dot ibm dot com 2009-05-31 12:33 --- For non-empty latch block we actually print "not vectorized: unexpected loop form." So I can change it to "not vectorized: non-empty latch block", and instead of "too many BBs" I can write "control flow in loop". -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39129
[Bug tree-optimization/39129] The meaning of 'BB' in "too many BBs in loop"
--- Comment #9 from irar at il dot ibm dot com 2009-06-01 08:20 --- Fixed. -- irar at il dot ibm dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39129
[Bug tree-optimization/40348] Powerpc spe segfaults in vectorizing powf (a[i], 0.5f)
--- Comment #2 from irar at il dot ibm dot com 2009-06-07 07:59 ---
So, I guess this patch fixes it?

Thanks,
Ira

Index: tree-vect-patterns.c
===
--- tree-vect-patterns.c (revision 148035)
+++ tree-vect-patterns.c (working copy)
@@ -515,6 +515,9 @@ vect_recog_pow_pattern (gimple last_stmt
       && REAL_VALUES_EQUAL (TREE_REAL_CST (exp), dconsthalf))
     {
       tree newfn = mathfn_built_in (TREE_TYPE (base), BUILT_IN_SQRT);
+      if (!newfn)
+        return NULL;
+
       *type_in = get_vectype_for_scalar_type (TREE_TYPE (base));
       if (*type_in)
         {

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40348
[Bug testsuite/40359] [4.5 Regression] Revision 148211 caused a lot of failures in the vect test suite.
--- Comment #12 from irar at il dot ibm dot com 2009-06-15 09:58 ---
(In reply to comment #9)
> The patch in comment #8 fixes the failures reported in comment #7. I now see
> (powerpc-apple-darwin9 with -m64):
> FAIL: gcc.dg/vect/vect-42.c scan-tree-dump-times vect "Alignment of access
> forced using versioning" 3

Is this target ([istarget *-*-darwin*] && [is-effective-target lp64]) (meaning vector_alignment_reachable is false for it)? If so, why do we do peeling? And why, in that case, doesn't it XPASS "Alignment of access forced using peeling" 1 "vect"?

Otherwise, vector_alignment_reachable is true, and it is not supposed to look for the versioning string at all (since the target is not vect_no_align, right?).

It doesn't make sense to me either way...

Revital, maybe you can try to add brackets:
{ ! { vector_alignment_reachable } }
instead of
{ ! vector_alignment_reachable}
?

Ira

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40359
[Bug testsuite/40359] [4.5 Regression] Revision 148211 caused a lot of failures in the vect test suite.
--- Comment #17 from irar at il dot ibm dot com 2009-06-16 07:36 ---
Dominique,

Could you please try this patch (I changed (!a && !b) to !(a || b)).

Thanks,
Ira

Index: vect-42.c
===
--- vect-42.c (revision 148487)
+++ vect-42.c (working copy)
@@ -63,7 +63,7 @@
 }

 /* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target { vect_no_align || { { ! vector_alignment_reachable} && {!vect_hw_misalign} } } } } } */
+/* { dg-final { scan-tree-dump-times "Alignment of access forced using versioning" 3 "vect" { target { vect_no_align || { ! { vector_alignment_reachable || vect_hw_misalign } } } } } } */
 /* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 "vect" { xfail { { vect_no_align || vect_hw_misalign } || { ! vector_alignment_reachable } } } } } */
 /* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align || vect_hw_misalign } || { ! vector_alignment_reachable } } } } } */
 /* { dg-final { cleanup-tree-dump "vect" } } */

--
irar at il dot ibm dot com changed:
What|Removed|Added
------------------------
CC||irar at il dot ibm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40359
[Bug testsuite/40359] [4.5 Regression] Revision 148211 caused a lot of failures in the vect test suite.
--- Comment #19 from irar at il dot ibm dot com 2009-06-16 10:18 ---
(In reply to comment #18)
> > Could you please try this patch (I changed (!a && !b) to !(a || b)).
> I am currently regtesting on my ppc and it takes a long time. Meanwhile I am
> not sure to understand what you expect with this change: if I am not mistaken
> !(a || b) == (!a && !b) .

Yes, the problem is that we think that the test is correct and it doesn't work because of some syntax/brackets/space problems.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40359
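Editorial illustration of the equivalence agreed on in comments #18 and #19: a small C sketch that compares both forms over every input. The function names are hypothetical; `a` and `b` stand for `vector_alignment_reachable` and `vect_hw_misalign` as in the selector from comment #17.

```c
#include <stdbool.h>

/* The condition as written before the patch in comment #17.  */
static bool form_before(bool a, bool b) { return !a && !b; }

/* The condition as rewritten by that patch.  */
static bool form_after(bool a, bool b) { return !(a || b); }

/* Returns true if the two forms agree on all four input combinations.  */
static bool forms_agree(void)
{
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            if (form_before(a, b) != form_after(a, b))
                return false;
    return true;
}
```

Since the two forms are logically identical, a change in test behaviour after such a rewrite would point at how the selector expression is parsed rather than at its logic, which is the suspicion voiced in the comment.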
[Bug testsuite/40359] [4.5 Regression] Revision 148211 caused a lot of failures in the vect test suite.
--- Comment #21 from irar at il dot ibm dot com 2009-06-16 11:08 ---
(In reply to comment #20)
> What are the expected patterns for the 3 variables
> with -m32 and -m64?

I am not sure, this is why I asked you if the target is ([istarget *-*-darwin*] && [is-effective-target lp64]). vect_no_align and vect_hw_misalign have to be false, so, I guess, vector_alignment_reachable is different for -m32 and -m64, since the behaviour is different.

"Alignment of access forced using versioning" means the vectorizer uses loop versioning to force alignment. It happens when there is no misalignment support at all (vect_no_align) or when other methods fail: loop peeling doesn't help (!vector_alignment_reachable) and also there is no hardware misalignment support (!vect_hw_misalign).

From the dump you attached, I see that loop peeling was done, therefore, vector_alignment_reachable is true, and it must not look for "Alignment of access forced using versioning". But it does. This is what makes me think that it is just a syntax problem.

On the other hand, I don't understand the difference between -m32 and -m64. It seems to me that ([istarget *-*-darwin*] && [is-effective-target lp64]) is false for -m32 and, possibly, true for -m64. But that contradicts the dump.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40359
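Editorial illustration of the condition described in comment #21: a C sketch of when the "Alignment of access forced using versioning" message is expected. The struct and function names are hypothetical; the three fields mirror the testsuite's effective-target keywords, and the predicate is only a restatement of the comment, not code from GCC or its testsuite.

```c
#include <stdbool.h>

/* Hypothetical bundle of the three target properties discussed above.  */
struct target_props {
    bool vect_no_align;               /* no misalignment support at all */
    bool vector_alignment_reachable;  /* loop peeling can reach alignment */
    bool vect_hw_misalign;            /* hardware handles misaligned access */
};

/* Versioning is expected when the target cannot handle misalignment at
   all, or when both other methods are unavailable: peeling does not
   help and there is no hardware misalignment support.  */
static bool expects_versioning(struct target_props t)
{
    return t.vect_no_align
           || (!t.vector_alignment_reachable && !t.vect_hw_misalign);
}
```

Under this reading, a target where peeling was actually performed (so `vector_alignment_reachable` is true) and `vect_no_align` is false should not be scanned for the versioning message, which is why the FAIL looked inconsistent with the attached dump.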
[Bug testsuite/40359] [4.5 Regression] Revision 148211 caused a lot of failures in the vect test suite.
--- Comment #23 from irar at il dot ibm dot com 2009-06-17 08:22 ---
(In reply to comment #22)
> My understanding is that ([istarget *-*-darwin*] && [is-effective-target
> lp64])
> should return false for -m32 and true for -m64. At least it is how it works on
> other tests I have looked at. Is there any way to check it?

You can add

/* { dg-final { scan-tree-dump-times "bla bla bla" 1 "vect" { target vector_alignment_reachable } } } */

to some test. It should fail for -m32 and pass for -m64 (since we think that vector_alignment_reachable is true for -m32 and false for -m64).

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40359
[Bug testsuite/40359] [4.5 Regression] Revision 148211 caused a lot of failures in the vect test suite.
--- Comment #25 from irar at il dot ibm dot com 2009-06-17 11:06 ---
(In reply to comment #24)
> If I add to vect-42.c (with my patch) the line
>
> /* { dg-final { scan-tree-dump-times "bla bla bla" 1 "vect" { target vector_alignment_reachable } } } */
...
> i.e., the test is done for -m32 (and fail) but not for -m64.

So, vector_alignment_reachable is true for -m32 and false for -m64.

...
> i.e., vect_hw_misalign is false for both -m32 and -m64.
> So it looks that vect_hw_misalign has the opposite meaning of that assumed in
> comment #16:
> > hmmm... versioning should not be done for targets that support
> > vect_hw_misalign...

Why? vect_hw_misalign means that misaligned data accesses are supported by hardware, therefore, we don't need to do versioning. And we expect versioning here with -m64 since both vect_hw_misalign and vector_alignment_reachable are false.

> Final note, the change in comment #17 does not help.

Thanks for checking.

I still don't understand why this test works on -m64

/* { dg-final { scan-tree-dump-times "Alignment of access forced using peeling" 1 "vect" { xfail { { vect_no_align || vect_hw_misalign } || { ! vector_alignment_reachable } } } } } */

vector_alignment_reachable is false, so there should be no peeling according to the test. But it is there, and the test doesn't XPASS... And, of course, I don't understand why we do peeling, i.e., why builtin vector_alignment_reachable returns true.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40359
[Bug testsuite/40359] [4.5 Regression] Revision 148211 caused a lot of failures in the vect test suite.
--- Comment #29 from irar at il dot ibm dot com 2009-06-17 12:40 --- Oh, so the first dump you attached (in comment #11) was for -m32. Now it makes sense. I think, we have to distinguish between vect_no_align and the other cases. I will prepare a patch tomorrow. Thanks, Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40359
[Bug middle-end/40475] [4.5 Regression] gcc.dg/vect/vect-nest-cycle-[12].c
--- Comment #1 from irar at il dot ibm dot com 2009-06-17 12:46 --- Could you please attach a vectorizer dump for one of them? I need to know what prevented vectorization. Thanks, Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40475
[Bug middle-end/40475] [4.5 Regression] gcc.dg/vect/vect-nest-cycle-[12].c
--- Comment #4 from irar at il dot ibm dot com 2009-06-18 07:17 --- Created an attachment (id=18017) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18017&action=view) patch to fix the tests Thanks. It's misalignment. Could you please check the attached patch? -- irar at il dot ibm dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com |dot org | Status|UNCONFIRMED |ASSIGNED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40475
[Bug testsuite/40359] [4.5 Regression] Revision 148211 caused a lot of failures in the vect test suite.
--- Comment #31 from irar at il dot ibm dot com 2009-06-18 08:03 ---
Created an attachment (id=18019)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18019&action=view)
patch to fix vect-42.c

I think the easiest way to fix it is to change the test to have one vectorizable loop again, as before http://gcc.gnu.org/viewcvs?view=rev&revision=147851.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40359
[Bug testsuite/40359] [4.5 Regression] Revision 148211 caused a lot of failures in the vect test suite.
--- Comment #33 from irar at il dot ibm dot com 2009-06-18 09:14 ---
Created an attachment (id=18020)
--> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18020&action=view)
fix vect-42.c

OK, now I understand why we need two loops here (we need to pass the arrays as parameters to avoid versioning for alias). So, I split the checks for vect_no_align and the others. I hope it works this time.

Thanks.

--
irar at il dot ibm dot com changed:
What|Removed|Added
Attachment #18019 is obsolete|0|1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40359
[Bug testsuite/40475] [4.5 Regression] gcc.dg/vect/vect-nest-cycle-[12].c
--- Comment #7 from irar at il dot ibm dot com 2009-06-21 07:32 --- Fixed. -- irar at il dot ibm dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40475
[Bug tree-optimization/40542] [4.3/4.4/4.5 Regression] vectorizes access to volatile array
--- Comment #2 from irar at il dot ibm dot com 2009-06-28 10:57 --- So, the solution is to prevent vectorization of volatile types, like in the patch below?

Index: tree-vect-data-refs.c
===================================================================
--- tree-vect-data-refs.c (revision 149023)
+++ tree-vect-data-refs.c (working copy)
@@ -1896,6 +1896,14 @@ vect_analyze_data_refs (loop_vec_info lo
           return false;
         }
 
+      if (TYPE_VOLATILE (TREE_TYPE (DR_REF (dr))))
+        {
+          if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
+            fprintf (vect_dump, "not vectorized: memory access of volatile "
+                     "type");
+          return false;
+        }
+
       stmt = DR_STMT (dr);
       stmt_info = vinfo_for_stmt (stmt);

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40542
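For illustration, here is a minimal sketch of the kind of loop such a check would reject. The names `hw_regs`, `plain`, and `N` are made up for this example and do not come from the PR's test case.

```c
#include <stddef.h>

#define N 16

/* Hypothetical arrays: one volatile, one ordinary. */
volatile int hw_regs[N];
int plain[N];

/* Every store to a volatile object is a separate observable access,
   so these stores should not be merged into wider vector stores. */
void fill_volatile(void)
{
    for (size_t i = 0; i < N; i++)
        hw_regs[i] = (int)i;
}

/* The same loop over a non-volatile array remains a candidate
   for vectorization. */
void fill_plain(void)
{
    for (size_t i = 0; i < N; i++)
        plain[i] = (int)i;
}
```

Both functions leave the same values in memory; only the permitted access pattern differs.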
[Bug tree-optimization/40542] [4.3/4.4/4.5 Regression] vectorizes access to volatile array
--- Comment #7 from irar at il dot ibm dot com 2009-06-30 12:02 --- Fixed. -- irar at il dot ibm dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40542
[Bug fortran/31067] MINLOC should sometimes be inlined (gas_dyn is sooooo sloooow)
--- Comment #27 from irar at il dot ibm dot com 2009-07-05 06:48 --- (In reply to comment #23)
> because there are two reductions in that loop which I think the vectorizer
> cannot handle:
Actually, the vectorizer can vectorize two reductions. I think the problem is the cond_expr in the reduction:
> pos.0_3 = [cond_expr] D.1599_29 ? pos.0_32 : pos.0_31;
> limit.2_5 = [cond_expr] D.1599_29 ? limit.2_22 : limit.2_8;
I'll look into it. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31067
[Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
--- Comment #2 from irar at il dot ibm dot com 2009-07-16 12:29 --- pr40770.c:20: note: ==> examining statement: sincostmp.21_1 = __builtin_cexpi (D.1625_3); pr40770.c:20: note: get vectype for scalar type: complex double pr40770.c:20: note: not vectorized: unsupported data-type complex double make_vector_type returns NULL for this type. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770
[Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
--- Comment #6 from irar at il dot ibm dot com 2009-07-16 17:31 --- (In reply to comment #3) > > make_vector_type returns NULL for this type. > Yes - there is no vector type for complex double. But the vectorizer > could query for a vector type for the complex component type (double) > and divide the vector element count by 2 (for complex) to get the > vectorization factor which would be 1 here. I see. > Should SLP the be possible > for that loop? Not with the current implementation - SLP needs strided stores to start. Here the stores are not even adjacent. I think, it would be better to vectorize this loop with regular loop-based vectorization to avoid permutations. I'll take a better look on Sunday. Ira > Richard. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770
[Bug tree-optimization/40801] internal compiler error: in vect_get_vec_def_for_stmt_copy, at tree-vect-stmts.c:1096
--- Comment #3 from irar at il dot ibm dot com 2009-07-19 09:35 --- Testing a fix. Ira -- irar at il dot ibm dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com |dot org | Status|NEW |ASSIGNED Last reconfirmed|2009-07-18 19:15:43 |2009-07-19 09:35:55 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40801
[Bug tree-optimization/40770] Vectorization of complex types, vectorization of sincos missing
--- Comment #7 from irar at il dot ibm dot com 2009-07-20 11:18 --- AFAIU, querying for the component type of a complex type is not difficult to implement. I think that loop-based vectorization is preferable here, so we should stay with a vectorization factor of 2 for doubles. The next problem is to vectorize
  D.1611_4 = IMAGPART_EXPR ;
and
  D.1612_6 = REALPART_EXPR ;
Currently, we support only loads and stores with IMAGPART/REALPART_EXPR, vectorizing them as strided accesses, with extract odd and even operations for loads. So, we will have to support interleaving of non-memory variables. Does __builtin_cexpi have a vector implementation? If so, does it return two vectors? If not, I guess we need something like:
  sincostmp.1 = __builtin_cexpi (xd[i]);
  sincostmp.2 = __builtin_cexpi (xd[i+1]);
  v1 = VEC_EXTRACT_EVEN (sincostmp.1, sincostmp.2);
  v2 = VEC_EXTRACT_ODD (sincostmp.1, sincostmp.2);
  sf[i:i+1] = v1;
  cf[i:i+1] = v2;
  i = i + 2;
Or we can use the two vectors from a vectorized __builtin_cexpi as parameters of the extract operations. Does that make sense? Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40770
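The extract-even/extract-odd step described above can be modelled in scalar C. This is only a sketch under stated assumptions: `VF`, `extract_even`, `extract_odd`, and `deinterleave` are hypothetical names, each "vector" is just `VF` consecutive doubles, and nothing here is the vectorizer's actual code.

```c
#include <assert.h>
#include <stddef.h>

#define VF 2 /* assumed vectorization factor: two doubles per vector */

/* Scalar model of an extract-even operation: out receives elements
   0, 2, 4, ... of the concatenation of vectors a and b. */
static void extract_even(const double *a, const double *b, double *out)
{
    for (size_t i = 0; i < VF; i++) {
        size_t k = 2 * i; /* index into the concatenation a ++ b */
        out[i] = (k < VF) ? a[k] : b[k - VF];
    }
}

/* Scalar model of extract-odd: elements 1, 3, 5, ... of a ++ b. */
static void extract_odd(const double *a, const double *b, double *out)
{
    for (size_t i = 0; i < VF; i++) {
        size_t k = 2 * i + 1;
        out[i] = (k < VF) ? a[k] : b[k - VF];
    }
}

/* Split n interleaved (re, im) pairs into separate re[] and im[] arrays.
   n must be a multiple of VF. */
void deinterleave(const double *pairs, size_t n, double *re, double *im)
{
    assert(n % VF == 0);
    for (size_t i = 0; i < n; i += VF) {
        const double *v0 = pairs + 2 * i; /* first vector of VF doubles */
        const double *v1 = v0 + VF;       /* second vector */
        extract_even(v0, v1, re + i);
        extract_odd(v0, v1, im + i);
    }
}
```

With interleaved input {1, 2, 3, 4}, the even extraction yields {1, 3} and the odd extraction yields {2, 4}, which is the separation of real and imaginary parts that the pseudocode relies on.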
[Bug fortran/31067] MINLOC should sometimes be inlined (gas_dyn is sooooo sloooow)
--- Comment #28 from irar at il dot ibm dot com 2009-07-20 12:03 --- I've just committed a patch that adds support for cond_expr in reductions in nested cycles (http://gcc.gnu.org/ml/gcc-patches/2009-07/msg01124.html). cond_expr cannot be vectorized in a reduction of the inner-most loop, because such a reduction changes the order of computation, and that cannot be done for cond_expr. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31067
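One way to see the ordering problem is to compare a scalar minloc-style loop with a naive two-lane evaluation of the same conditional update. This is an illustration only: both function names are hypothetical, and the "lanes" are a scalar model of what a reordered reduction would compute, not code the vectorizer generates.

```c
#include <assert.h>
#include <limits.h>

/* Original order: the first occurrence of the minimum wins (1-based). */
int minloc_scalar(const int *arr, int n)
{
    int limit = INT_MAX, pos = 0;
    for (int i = 0; i < n; i++)
        if (arr[i] < limit) {
            limit = arr[i];
            pos = i + 1;
        }
    return pos;
}

/* Naive two-lane model: lane 0 sees even indices, lane 1 odd indices,
   and the lanes are combined at the end. n must be even. */
int minloc_two_lanes_naive(const int *arr, int n)
{
    int limit[2] = { INT_MAX, INT_MAX };
    int pos[2] = { 0, 0 };

    assert(n % 2 == 0);
    for (int i = 0; i < n; i += 2)
        for (int lane = 0; lane < 2; lane++)
            if (arr[i + lane] < limit[lane]) {
                limit[lane] = arr[i + lane];
                pos[lane] = i + lane + 1;
            }
    /* Naive combine: lane 0 wins ties, whatever its position is. */
    return (limit[1] < limit[0]) ? pos[1] : pos[0];
}
```

For the input {3, 1, 1, 5} the scalar loop reports position 2, while the naive two-lane version reports position 3, because the tie between the two minima is resolved in a different order.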
[Bug tree-optimization/40801] internal compiler error: in vect_get_vec_def_for_stmt_copy, at tree-vect-stmts.c:1096
--- Comment #5 from irar at il dot ibm dot com 2009-07-26 07:04 --- Fixed. -- irar at il dot ibm dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40801
[Bug fortran/31067] MINLOC should sometimes be inlined (gas_dyn is sooooo sloooow)
--- Comment #32 from irar at il dot ibm dot com 2009-07-26 07:48 --- (In reply to comment #30) > Regarding the just committed inline version: It would be interesting to know > whether it is vectorizable (with/without -ffinite-math-only [i.e. > -ffast-math]). It depends on where it is inlined. It has to be vectorized in outer loop (see my previous comment), so it needs another loop around it. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31067
[Bug fortran/31067] MINLOC should sometimes be inlined (gas_dyn is sooooo sloooow)
--- Comment #34 from irar at il dot ibm dot com 2009-07-27 08:36 --- (In reply to comment #33)
> Using the example from comment 23 with ...
> gfortran shows: test.f90:12: note: not vectorized: unsupported use in stmt.
> and needs 2.272s. (By comparison. 4.4 needs 3.688s.)
This is for the inner loop vectorization. For the outer loop we get:
tmp.f90:11: note: not vectorized: control flow in loop.
because of the if's. Maybe loop unswitching can help us. Vectorizable outer-loops look like this:

(pre-header)
   |
  header <---+
   |         |
  inner-loop |
   |         |
  tail ------+
   |
(exit-bb)

Does ifort vectorize the exact same implementation of minloc? Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31067
[Bug fortran/31067] MINLOC should sometimes be inlined (gas_dyn is sooooo sloooow)
--- Comment #38 from irar at il dot ibm dot com 2009-07-27 12:44 --- I am not sure that that kind of computation can be generated automatically, since in general the order of calculation of cond_expr cannot be changed. However, the loop can be split:
  for (i = 0; i < end; i++)
    if (arr[i] < limit)
      limit = arr[i];
  for (i = 0; i < end; i++)
    if (arr[i] == limit)
      {
        pos = i + 1;
        break;
      }
making the first loop vectorizable (inner-most loop vectorization). Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31067
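The split described above can be wrapped in a complete function for experimentation. This is a sketch: the name `minloc_split`, the `INT_MAX` starting value for `limit`, and the return of 0 for an empty array are assumptions made for the example.

```c
#include <limits.h>

/* Returns the 1-based position of the first minimum of arr[0..end-1],
   or 0 when end is 0. */
int minloc_split(const int *arr, int end)
{
    int limit = INT_MAX; /* assumed initial value */
    int pos = 0;
    int i;

    /* First loop: a plain min reduction with no position bookkeeping. */
    for (i = 0; i < end; i++)
        if (arr[i] < limit)
            limit = arr[i];

    /* Second loop: find the first element equal to the minimum. */
    for (i = 0; i < end; i++)
        if (arr[i] == limit) {
            pos = i + 1;
            break;
        }

    return pos;
}
```

The first loop carries only `limit` across iterations, which is what makes it a simple reduction; the early exit is confined to the second loop.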
[Bug fortran/31067] MINLOC should sometimes be inlined (gas_dyn is sooooo sloooow)
--- Comment #41 from irar at il dot ibm dot com 2009-07-28 08:12 --- That requires pattern recognition. MIN/MAX_EXPR are recognized by the first phiopt pass, so MIN/MAXLOC should be either also recognized there or in the vectorizer. (The phiopt pass transforms if clause to MIN/MAX_EXPR. The vectorizer gets COND_EXPR after if-conversion pass). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31067
[Bug middle-end/37150] vectorizer misses some loops
--- Comment #10 from irar at il dot ibm dot com 2009-08-06 10:49 --- Yes. The problem is that only a basic implementation was added. To vectorize this code, several improvements must be made: support stmt group sizes greater than the vector size, allow loads and stores to the same location, initiate SLP analysis from groups of loads, support misaligned accesses, etc. Finding a benchmark could really help to push these items to the top of the vectorizer's todo list. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37150
[Bug tree-optimization/41008] [4.5 Regression] ICE in vect_is_simple_reduction, at tree-vect-loop.c:1708
--- Comment #3 from irar at il dot ibm dot com 2009-08-09 12:15 --- Fixed. -- irar at il dot ibm dot com changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41008
[Bug tree-optimization/41019] Variate_generator with mt19937 and normal_distribution produces wrong sequence for "-O3".
--- Comment #6 from irar at il dot ibm dot com 2009-08-12 12:14 --- Looks like a problem in data-ref analysis: Creating dr for this_6(D)->_M_x[__k_87] ... base_address: this_6(D) offset from base address: 0 constant offset from base address: 0 step: 8 aligned to: 128 base_object: this_6(D)->_M_x[0] And the vectorizer creates accesses relatively to this_6(D) (base_address above) with zero offset (instead of this_6(D)->_M_x[0] or with an offset of _M_x). Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41019
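The difference between addressing relative to the object and addressing relative to the array member can be sketched as follows. The struct layout and all names are hypothetical (in particular, it is an assumption that the array is not the first member), and the 8-byte step in the dump corresponds to `sizeof(unsigned long)` only on LP64 targets.

```c
#include <stddef.h>

/* Hypothetical layout: some state precedes the array member. */
struct engine {
    unsigned long pad;
    unsigned long x[4];
};

/* Address of e->x[k], computed as base + member offset + k * step. */
unsigned long *elem_addr(struct engine *e, size_t k)
{
    return (unsigned long *)((char *)e + offsetof(struct engine, x)
                             + k * sizeof(unsigned long));
}

/* The same computation with the member offset dropped: this lands on
   whatever precedes the array inside the object. */
unsigned long *elem_addr_zero_offset(struct engine *e, size_t k)
{
    return (unsigned long *)((char *)e + k * sizeof(unsigned long));
}
```

With this layout, `elem_addr_zero_offset(&e, 0)` points at `e.pad` rather than `e.x[0]`, which is the kind of mismatch the comment describes.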
[Bug tree-optimization/41019] Variate_generator with mt19937 and normal_distribution produces wrong sequence for "-O3".
--- Comment #8 from irar at il dot ibm dot com 2009-08-13 05:40 --- (In reply to comment #7) > Oh. Did you manage to reduce or reproduce with a smaller testcase? No, I just looked at the vectorized loops. The guilty one is bin/../lib/gcc/x86_64-unknown-linux-gnu/4.5.0/../../../../include/c++/4.5.0/tr1/random.tcc:231 Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41019
[Bug tree-optimization/41019] Variate_generator with mt19937 and normal_distribution produces wrong sequence for "-O3".
--- Comment #10 from irar at il dot ibm dot com 2009-08-13 11:34 --- Reduced testcase:

#include
#include
#define N 4
long int a[N];
int main ()
{
  int k;
  for (k = 0; k < N; ++k)
    a[k] = a[k] != 5 ? 12 : 10;
  for (k = 0; k < N; ++k)
    printf ("%u ", a[k]);
  printf ("\n");
  return 0;
}

%gcc -O3 t.c
% ./a.out
0 0 0 0
%gcc -O2 t.c
% ./a.out
12 12 12 12

If the type of 'a' is int, there is no problem. The vectorizer produces almost the same code in both cases (except for number of iterations and types). I am attaching the assembly for int and long int versions. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41019
[Bug tree-optimization/41019] Variate_generator with mt19937 and normal_distribution produces wrong sequence for "-O3".
--- Comment #11 from irar at il dot ibm dot com 2009-08-13 11:36 --- Created an attachment (id=18350) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18350&action=view) The assembly for the long int version (wrong code) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41019
[Bug tree-optimization/41019] Variate_generator with mt19937 and normal_distribution produces wrong sequence for "-O3".
--- Comment #12 from irar at il dot ibm dot com 2009-08-13 11:37 --- Created an attachment (id=18351) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=18351&action=view) The assembly for the int version (correct) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41019
[Bug tree-optimization/25211] [4.1/4.2 Regression] verify_ssa ICE for mesa with -Os -ftree-loop-linear
--- Comment #4 from irar at il dot ibm dot com 2005-12-14 13:11 --- I think the reason why this ICE occurs with my patch (http://gcc.gnu.org/viewcvs?view=rev&rev=102356) is that my patch enables data-refs analysis for INDIRECT_REFs. A similar ICE in PR 20256 also happens before my patch, since the data-refs there are ARRAY_REFs, and ARRAY_REFs were already supported before. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25211
[Bug tree-optimization/25371] -ftree-vectorize results in internal compiler error on AMD64
--- Comment #3 from irar at il dot ibm dot com 2005-12-18 08:15 --- I failed to reproduce this ICE on ppc and i686. Vectorizer's dump file can help. -- irar at il dot ibm dot com changed: What|Removed |Added CC||irar at il dot ibm dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25371
[Bug tree-optimization/21591] not vectorizing a loop with access to structs
--- Comment #7 from irar at il dot ibm dot com 2006-09-13 08:32 --- I think, the problem here is that we only check SMT and not NMT. I am preparing a patch to fix this. NMT is stored in ptr_info_def of data-ref, and only if it does not exist, SMT will be checked. -- irar at il dot ibm dot com changed: What|Removed |Added CC||irar at il dot ibm dot com, ||dnovillo at redhat dot com AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com |dot org | Status|NEW |ASSIGNED Last reconfirmed|2006-02-21 01:04:59 |2006-09-13 08:32:31 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21591
[Bug tree-optimization/18438] vectorizer failed for vector matrix multiplication
--- Comment #3 from irar at il dot ibm dot com 2006-09-19 07:10 --- > t.c:20: note: not vectorized: mixed data-types > t.c:20: note: can't determine vectorization factor. > > Removing flags[i] = true; Multiple data-types vectorization is already supported in the autovect branch, and the patches for mainline (starting from http://gcc.gnu.org/ml/gcc-patches/2006-02/msg00941.html) will be committed as soon as 4.3 is open. > we get: > t.c:20: note: not consecutive access > t.c:20: note: not vectorized: complicated access pattern. Vectorization of strided accesses is also already implemented in the autovect branch (and will be committed to the mainline 4.3). However, this case contains stores with gaps (stores to opoints[i][0], opoints[i][1], and opoints[i][2], without a store to opoints[i][3]), and only loads with gaps are currently supported. Therefore, this loop will be vectorizable in the autovect branch (and soon in the mainline 4.3) if a store to opoints[i][3] is added. Ira -- irar at il dot ibm dot com changed: What|Removed |Added CC| |irar at il dot ibm dot com Last reconfirmed|2005-12-21 03:49:03 |2006-09-19 07:10:15 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18438
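A store group with a gap, and the same loop with the gap closed, might look like the sketch below. The struct `point4` and both function names are hypothetical stand-ins for the PR's `opoints` array, not the code from the test case.

```c
#include <stddef.h>

/* Hypothetical stand-in for opoints: four floats per point. */
struct point4 {
    float v[4];
};

/* Stores to elements 0..2 only: element 3 is a gap in the store group. */
void scale_with_gap(struct point4 *out, const struct point4 *in,
                    size_t n, float s)
{
    for (size_t i = 0; i < n; i++) {
        out[i].v[0] = in[i].v[0] * s;
        out[i].v[1] = in[i].v[1] * s;
        out[i].v[2] = in[i].v[2] * s;
    }
}

/* The same loop with a store to element 3 added, closing the gap.
   Note that this changes behavior: out[i].v[3] is now overwritten. */
void scale_no_gap(struct point4 *out, const struct point4 *in,
                  size_t n, float s)
{
    for (size_t i = 0; i < n; i++) {
        out[i].v[0] = in[i].v[0] * s;
        out[i].v[1] = in[i].v[1] * s;
        out[i].v[2] = in[i].v[2] * s;
        out[i].v[3] = in[i].v[3] * s;
    }
}
```

Whether the extra store is acceptable depends on whether the fourth element is otherwise used, so the second form is a workaround rather than a drop-in replacement.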
[Bug tree-optimization/19049] not vectorizing a fortran loop
--- Comment #7 from irar at il dot ibm dot com 2006-09-19 07:29 --- Even though vectorization of strided accesses is already implemented in the autovect branch (and will be committed to the mainline 4.3), this case contains a store with a gap (store to a[i] without a store to a[i-1]), and such stores are not supported (the current implementation supports only loads with gaps). Note, however, that adding a store to a[i-1] will create a data dependence in the loop. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19049
[Bug tree-optimization/26969] [4.1 Regression] ICE with -O1 -funswitch-loops -ftree-vectorize
--- Comment #15 from irar at il dot ibm dot com 2006-10-18 11:03 --- (In reply to comment #13) > We need to check if above patch fixes PR26969 as well. Checked, it does not. -- irar at il dot ibm dot com changed: What|Removed |Added CC| |irar at il dot ibm dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26969
[Bug tree-optimization/26362] ICE on the autovect-branch (gfortran example)
--- Comment #3 from irar at il dot ibm dot com 2007-01-28 10:45 --- The current versions of both mainline and the autovect branch do not ICE. Strided loads are not implemented for SSE. I opened PR 30211 for it. I think this PR can be closed. Ira -- irar at il dot ibm dot com changed: What|Removed |Added CC||irar at il dot ibm dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26362
[Bug tree-optimization/27659] ICE on autovect-branch
--- Comment #3 from irar at il dot ibm dot com 2007-01-28 11:38 --- I tried to reproduce this on x86 with the current autovect branch and mainline with .../g++ -fpreprocessed tmp.ii -S -O3 -ftree-vectorize -msse2 -ansi -fdump-tree-vect-details. It does not ICE, and the loop is vectorized. Ira -- irar at il dot ibm dot com changed: What|Removed |Added CC||irar at il dot ibm dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27659
[Bug c/30843] [4.3 Regression] ice for legal code with -ftree-vectorize -O2
--- Comment #5 from irar at il dot ibm dot com 2007-02-19 11:18 --- Subject: Re: ice for legal code with -ftree-vectorize -O2 I know what the problem is. If we don't remove the store while iterating, we can't get it later (the si), can we? Ira -- dorit at il dot ibm dot com changed: What|Removed |Added CC| |irar at il dot ibm dot com -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30843
[Bug c/30843] [4.3 Regression] ice for legal code with -ftree-vectorize -O2
--- Comment #6 from irar at il dot ibm dot com 2007-02-19 12:41 --- Sorry about the last comment, it was sent by mistake. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30843
[Bug bootstrap/30921] New: Bootstrap failure with -ftree-vectorize on i386
Bootstrap with vectorization enabled fails on i386 starting from revision 121767: http://gcc.gnu.org/viewcvs?view=rev&revision=121767 Ira -- Summary: Bootstrap failure with -ftree-vectorize on i386 Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: bootstrap AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: irar at il dot ibm dot com GCC build triplet: i386-redhat-linux GCC host triplet: i386-redhat-linux GCC target triplet: i386-redhat-linux http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30921
[Bug bootstrap/30921] Bootstrap failure with -ftree-vectorize on i386
--- Comment #1 from irar at il dot ibm dot com 2007-02-22 07:58 --- Here is the ChangeLog entry for that patch:

2007-02-09  Richard Henderson  <[EMAIL PROTECTED]>

    * config/i386/constraints.md (Ym): New constraint.
    * config/i386/i386.md (movsi_1): Change Y2 to Yi constraints.
    (movdi_1_rex64): Split sse and xmm general register moves from
    memory move alternatives.  Use conditional register constraints.
    (movsf_1, movdf_integer): Likewise.
    (zero_extendsidi2_32, zero_extendsidi2_rex64): Likewise.
    (movdf_integer_rex64): New.
    (pushsf_rex64): Fix output constraints.
    * config/i386/sse.md (sse2_loadld): Split rm alternative, use Yi.
    (sse2_stored): Likewise.
    (sse2_storeq_rex64): New.
    * config/i386/i386.c (x86_inter_unit_moves): Enable for not amd
    and not generic.
    (ix86_secondary_memory_needed): Don't bypass TARGET_INTER_UNIT_MOVES
    for optimize_size.  Remove SF/DFmode hack.

Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30921
[Bug bootstrap/30921] Bootstrap failure with -ftree-vectorize on i386
--- Comment #3 from irar at il dot ibm dot com 2007-02-22 08:22 --- (In reply to comment #2) > (In reply to comment #0) > > Bootstrap with vectorization enabled fails on i386 starting from revision > > 121767: > > http://gcc.gnu.org/viewcvs?view=rev&revision=121767 > Could you post exact steps how to reproduce this failure? Run make bootstrap BOOT_CFLAGS="-O2 -g -ftree-vectorize -msse2" Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30921
[Bug tree-optimization/24309] [4.1/4.2/4.3 Regression] ICE with -O3 -ftree-loop-linear
--- Comment #15 from irar at il dot ibm dot com 2007-03-05 09:30 --- I tried the reduced testcase on powerpc with -ftree-loop-linear and both -O2 and -O3 on 4.1, 4.2 and 4.3, and it works fine. Ira -- irar at il dot ibm dot com changed: What|Removed |Added CC||irar at il dot ibm dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24309
[Bug tree-optimization/25371] -ftree-vectorize results in internal compiler error on AMD64
--- Comment #6 from irar at il dot ibm dot com 2007-03-11 10:33 --- Harsha, could you please attach vectorizer's dump file (produced with -fdump-tree-vect-details)? Thanks, Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25371
[Bug tree-optimization/31343] New: ICE in data-refs dependence testing
An attempt to divide by zero is made (causing an ICE on the attached test case) for evolution functions with zero step. For the following evolution functions of pS[i_15].x and pS[i_15].y from the attached test
  (chrec_a = {{0, +, 1}_1, +, 0}_2)
  (chrec_b = {{1, +, 1}_1, +, 0}_2)
the difference (-1) is calculated, and then the check whether the step (0) divides the difference is performed in function chrec_steps_divide_constant_p (tree-data-ref.c), causing the ICE. -- Summary: ICE in data-refs dependence testing Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: irar at il dot ibm dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31343
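A divisibility test that guards against a zero step could look like the sketch below. It is not the fix that went into GCC: `step_divides_difference` is a hypothetical helper on plain integers, and treating a zero step as dividing only a zero difference is one possible convention chosen for the example.

```c
#include <stdbool.h>

/* Returns true if step divides diff, without ever dividing by zero. */
bool step_divides_difference(long step, long diff)
{
    if (step == 0)
        return diff == 0; /* assumed convention: 0 divides only 0 */
    if (step == 1 || step == -1)
        return true;      /* also avoids LONG_MIN % -1 overflow */
    return diff % step == 0;
}
```

For the case in the report (step 0, difference -1), this returns false instead of trapping.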
[Bug tree-optimization/31343] ICE in data-refs dependence testing
--- Comment #1 from irar at il dot ibm dot com 2007-03-25 10:02 --- Created an attachment (id=13281) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13281&action=view) test case -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31343
[Bug tree-optimization/32806] New: Missing optimization to remove backword dependencies
for (i=0; ihttp://gcc.gnu.org/bugzilla/show_bug.cgi?id=32806
[Bug bootstrap/33031] Bootstrap fails on gcc/tree.c
--- Comment #1 from irar at il dot ibm dot com 2007-08-09 08:44 --- I got this too on x86_64-linux. I guess the guilty patch is r127306 | chaoyingfu | 2007-08-09 01:29:12 +0300 (Thu, 09 Aug 2007) | 213 lines since it added the function fixed_zerop: * tree.c ... (fixed_zerop): New function. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33031
[Bug tree-optimization/33447] New: Non-empty latch block prevents loop vectorization
The following loop (from linpk.f90) contains a non-empty latch block before tree optimizations:

Source code:
Line
      m = MOD(N,4)
323   IF ( m.NE.0 ) THEN
324      DO i = 1 , m
325         Dy(i) = Dy(i) + Da*Dx(i)
326      ENDDO
327      IF ( N.LT.4 ) RETURN
328   ENDIF
329   mp1 = m + 1
330   DO i = mp1 , N , 4
331      Dy(i) = Dy(i) + Da*Dx(i)
332      Dy(i+1) = Dy(i+1) + Da*Dx(i+1)
333      Dy(i+2) = Dy(i+2) + Da*Dx(i+2)
334      Dy(i+3) = Dy(i+3) + Da*Dx(i+3)
335   ENDDO

The first SSA dump:
: ...
if (countm1.32_8 == 0) goto ; else goto ;
:
countm1.32_98 = countm1.32_8 + 4294967295;
goto ;

This is also related to PR 28643 and PR 33244. However, in these PRs some tree optimization puts stmts/phi nodes in the latch block, while in the linpk example the latch block is non-empty to begin with. -- Summary: Non-empty latch block prevents loop vectorization Product: gcc Version: 4.3.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: enhancement Priority: P3 Component: tree-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: irar at il dot ibm dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33447
[Bug middle-end/33449] [4.3 regression] ICE for fortran code with -O2 -ftree-vectorize
--- Comment #4 from irar at il dot ibm dot com 2007-09-17 08:59 --- (In reply to comment #3) > I can reproduce that on x86_64-linux with trunk rev. 128442. Dorit's fix is revision 128514, so it is not supposed to work on 128442... Anyway, I am trying to reproduce this ICE on x86_64-linux now, with the current trunk (128538). Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33449
[Bug middle-end/33449] [4.3 regression] ICE for fortran code with -O2 -ftree-vectorize
--- Comment #5 from irar at il dot ibm dot com 2007-09-17 09:54 --- (In reply to comment #4) > Anyway, I am trying to reproduce this ICE on x86_64-linux now, with the > current > trunk (128538). It doesn't ICE for me. (The loop gets vectorized). Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33449
[Bug target/33505] Vectorizer (or spu target builtins) and PCH don't get along
--- Comment #1 from irar at il dot ibm dot com 2007-09-30 09:42 --- I managed to reproduce it. Here http://gcc.gnu.org/ml/gcc-patches/2007-09/msg01559.html Richard suggested adding a GTY(()) to

struct spu_builtin_description spu_builtins[] = {
#define DEF_BUILTIN(fcode, icode, name, type, params) \
  {fcode, icode, name, type, params, NULL_TREE},
#include "spu-builtins.def"
#undef DEF_BUILTIN
};

Actually there is a GTY(()) in spu-builtins.h:

extern GTY(()) struct spu_builtin_description spu_builtins[];

But anyway, I tried the following and it didn't help:

Index: spu.c
===================================================================
--- spu.c (revision 128708)
+++ spu.c (working copy)
@@ -4459,7 +4459,7 @@
 ^L
 /* Create the built-in types and functions */
 
-struct spu_builtin_description spu_builtins[] = {
+struct spu_builtin_description GTY (()) spu_builtins[] = {
 #define DEF_BUILTIN(fcode, icode, name, type, params) \
   {fcode, icode, name, type, params, NULL_TREE},
 #include "spu-builtins.def"

Ira -- irar at il dot ibm dot com changed: What|Removed |Added CC| |irar at il dot ibm dot com, ||richard dot guenther at ||gmail dot com Status|UNCONFIRMED |NEW Ever Confirmed|0 |1 Last reconfirmed|-00-00 00:00:00 |2007-09-30 09:42:56 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33505
[Bug middle-end/33597] Internal compiler error while compiling libswcale from ffmpeg
--- Comment #6 from irar at il dot ibm dot com 2007-09-30 10:37 --- (In reply to comment #5) > Patch in testing: Thanks for fixing this! (I've just started to test the exact same patch :)) Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33597
[Bug target/33505] Vectorizer (or spu target builtins) and PCH don't get along
--- Comment #3 from irar at il dot ibm dot com 2007-10-02 09:22 --- (In reply to comment #2) > This is kinda on my list of stuff to forward port from the internal PS3 > toolchain. Maybe I can help with testing this patch for mainline? Thanks, Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33505
[Bug tree-optimization/33680] [4.3 Regression] ICE when compilling elbg.c from ffmpeg (vectorizer)
--- Comment #7 from irar at il dot ibm dot com 2007-10-07 12:31 --- (In reply to comment #3) I get:

pr33680.c: In function ‘f’:
pr33680.c:1: error: expected an SSA_NAME object
pr33680.c:1: error: in statement
D.1618_93 = D.1556 /[ex] 4;
pr33680.c:1: internal compiler error: verify_ssa failed

The problem is that D.1556 is a VAR_DECL and not an SSA_NAME. This stmt is created while gimplifying data-ref base in vect_create_addr_base_for_vector_ref(). The expr is

(int[0:D.1553] *) newcentroid.1_22 + (long unsigned int) dim_4(D) * 8 sizes-gimplified type_1 BLK size unit size ...

D.1618 = D.1556 /[ex] 4 is created, taking D.1556 as the unit size in gimplify_compound_lval. And later D.1618 is replaced with an SSA_NAME D.1618_93, since it's a lhs (in gimplify_modify_expr).

> Vectorizer produces invalid Gimple SSA code:
>
> D.1769_169 = D.1599 /[ex] 4;
>
> D.1599 should be renamed.
>
Where should it be renamed? In gimplify_smth? Thanks, Ira -- irar at il dot ibm dot com changed: What|Removed |Added CC| |irar at il dot ibm dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33680
[Bug tree-optimization/33680] [4.3 Regression] ICE when compilling elbg.c from ffmpeg (vectorizer)
--- Comment #9 from irar at il dot ibm dot com 2007-10-09 12:49 --- (In reply to comment #8) > If you use force_gimple_operand_bsi, it takes care of that itself. Thanks! I will try to see if we can use it. The problem is we don't have a bsi, we insert those stmts using bsi_insert_on_edge_immediate on loop_preheader_edge. > If you e.g. use force_gimple_operand instead, you need to take care of > calling mark_symbols_for_renaming yourself. In order to do this, we will have to go through the statement list created by force_gimple_operand, and I am not sure that it's a good idea. Thanks, Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33680
[Bug tree-optimization/33680] [4.3 Regression] ICE when compilling elbg.c from ffmpeg (vectorizer)
--- Comment #11 from irar at il dot ibm dot com 2007-10-10 13:23 --- I understand that those symbols have to be renamed, I am just saying that maybe it should be done in the gimplifier and not in the vectorizer. But since force_gimple_operand_bsi also goes through the statements list, I guess it is reasonable to do the same thing in the vectorizer. Or we can add a new API like force_gimple_operand_and_mark_for_renaming. Anyway, I tried your patch. Now we get a different ICE:

internal compiler error: in referenced_var_lookup, at tree-dfa.c:642

D.1556 is marked for renaming but then during update_ssa it cannot find it - htab_find_with_hash (tree-dfa.c:641) returns NULL.

#0 referenced_var_lookup (uid=1556) at ../../gcc/gcc/tree-dfa.c:642
#1 0x006f9308 in update_ssa (update_flags=2048) at ../../gcc/gcc/tree-into-ssa.c:3207
#2 0x00aac184 in vect_transform_loop (loop_vinfo=0xe94410) at ../../gcc/gcc/tree-vect-transform.c:7431
#3 0x007fae09 in vectorize_loops () at ../../gcc/gcc/tree-vectorizer.c:2507
#4 0x00631726 in execute_one_pass (pass=0xdfc0c0) at ../../gcc/gcc/passes.c:1116
#5 0x006318ec in execute_pass_list (pass=0xdfc0c0) at ../../gcc/gcc/passes.c:1169
#6 0x006318fe in execute_pass_list (pass=0xdfbee0) at ../../gcc/gcc/passes.c:1170
#7 0x006318fe in execute_pass_list (pass=0xdfb2e0) at ../../gcc/gcc/passes.c:1170
#8 0x007086ce in tree_rest_of_compilation (fndecl=0x2ba807b05800) at ../../gcc/gcc/tree-optimize.c:404
#9 0x0088a054 in cgraph_expand_function (node=0x2ba807b05900) at ../../gcc/gcc/cgraphunit.c:1070
#10 0x0088bbe7 in cgraph_optimize () at ../../gcc/gcc/cgraphunit.c:1139
#11 0x004144fe in c_write_global_declarations () at ../../gcc/gcc/c-decl.c:8077
#12 0x006ad2e7 in toplev_main (argc=, argv=) at ../../gcc/gcc/toplev.c:1052
#13 0x2ba8077d5154 in __libc_start_main () from /lib64/libc.so.6
#14 0x00403cf9 in _start ()

Thanks, Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33680
[Bug tree-optimization/33680] [4.3 Regression] ICE when compilling elbg.c from ffmpeg (vectorizer)
--- Comment #13 from irar at il dot ibm dot com 2007-10-11 10:43 --- Maybe we can fix DCE not to eliminate such vars? Or somehow fix split_constant_offset? The following patch changes the base from

(int[0:D.1553] *) newcentroid.1_22 + (long unsigned int) dim_4(D) * 8

to

(int[0:D.1553] *) D.1560_21 + (long unsigned int) dim_4(D) * 8

and, hence, there is no need for the size of newcentroid.1_22:

Index: tree-data-ref.c
===================================================================
--- tree-data-ref.c (revision 128902)
+++ tree-data-ref.c (working copy)
@@ -579,8 +579,10 @@ split_constant_offset (tree exp, tree *v
            {
              split_constant_offset (def_stmt_rhs, &var0, &off0);
              var0 = fold_convert (type, var0);
-             *var = var0;
-             *off = off0;
+             split_constant_offset (var0, &var2, &off2);
+             *var = var2;
+             *off = fold_build2 (PLUS_EXPR, TREE_TYPE (off2),
+                                 off0, off2);
              return;
            }
        }

Maybe we can check if the base is of the VLA type and then try to further split it as above (and not vectorize if we fail)? Thanks, Ira -- irar at il dot ibm dot com changed: What|Removed |Added CC||rakdver at gcc dot gnu dot ||org Priority|P1 |P3 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33680
[Bug tree-optimization/33680] [4.3 Regression] ICE when compilling elbg.c from ffmpeg (vectorizer)
--- Comment #14 from irar at il dot ibm dot com 2007-10-11 12:34 --- BTW, without this patch http://gcc.gnu.org/ml/gcc-patches/2007-07/msg02122.html there is no ICE and the loop gets vectorized. Ira -- irar at il dot ibm dot com changed: What|Removed |Added CC||Jan dot Sjodin at amd dot ||com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33680
[Bug tree-optimization/33680] [4.3 Regression] ICE when compilling elbg.c from ffmpeg (vectorizer)
--- Comment #16 from irar at il dot ibm dot com 2007-10-15 10:42 --- This patch fixes the ICE and doesn't cause regressions in the vectorizer testsuite:

Index: tree-data-ref.c
===================================================================
--- tree-data-ref.c (revision 129292)
+++ tree-data-ref.c (working copy)
@@ -571,11 +571,16 @@ split_constant_offset (tree exp, tree *v
       if (TREE_CODE (def_stmt) == GIMPLE_MODIFY_STMT)
        {
          tree def_stmt_rhs = GIMPLE_STMT_OPERAND (def_stmt, 1);
+         tree arr = NULL_TREE;
+
+         if (TREE_CODE (def_stmt_rhs) == ADDR_EXPR)
+           arr = TREE_OPERAND (def_stmt_rhs, 0);
 
          if (!TREE_SIDE_EFFECTS (def_stmt_rhs)
              && EXPR_P (def_stmt_rhs)
              && !REFERENCE_CLASS_P (def_stmt_rhs)
-             && !get_call_expr_in (def_stmt_rhs))
+             && !get_call_expr_in (def_stmt_rhs)
+             && (!arr || TREE_THIS_NOTRAP (arr)))
            {
              split_constant_offset (def_stmt_rhs, &var0, &off0);
              var0 = fold_convert (type, var0);

This way we avoid arrays with unknown size. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33680
[Bug tree-optimization/33804] ICE in vect_transform_stmt, at tree-vect-transform.c:6131 with -ftree-vectorize
--- Comment #3 from irar at il dot ibm dot com 2007-10-18 08:33 --- It works fine for me (and the loop gets SLPed) on powerpc-64 and x86_64. Could you please run it with -fdump-tree-vect-details and attach the dump file? Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33804
[Bug tree-optimization/33812] New: Vectorizer testcases fail
With current trunk (r129433) vectorizer testcases fail on powerpc64:

FAIL: gcc.dg/vect/vect-64.c (internal compiler error)
FAIL: gcc.dg/vect/vect-64.c (test for excess errors)
WARNING: gcc.dg/vect/vect-64.c compilation failed to produce executable
FAIL: gcc.dg/vect/vect-68.c (internal compiler error)
FAIL: gcc.dg/vect/vect-68.c (test for excess errors)
WARNING: gcc.dg/vect/vect-68.c compilation failed to produce executable
FAIL: gcc.dg/vect/vect-70.c (internal compiler error)
FAIL: gcc.dg/vect/vect-70.c (test for excess errors)
WARNING: gcc.dg/vect/vect-70.c compilation failed to produce executable
FAIL: gcc.dg/vect/no-scevccp-slp-31.c (internal compiler error)
FAIL: gcc.dg/vect/no-scevccp-slp-31.c (test for excess errors)
WARNING: gcc.dg/vect/no-scevccp-slp-31.c compilation failed to produce executable

The tests ICE:

vect-64.c: In function 'main1':
vect-64.c:75: internal compiler error: in change_address_1, at emit-rtl.c:1888
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.

#0  change_address_1 (memref=0xf7e44f40, mode=SImode, addr=0xf7e44f30, validate=1) at ../../gcc/gcc/emit-rtl.c:1888
#1  0x10179fd4 in validize_mem (ref=0xf7e44f40) at ../../gcc/gcc/explow.c:546
#2  0x101a88e8 in emit_move_insn (x=0xf7e38e00, y=0xf7e44f40) at ../../gcc/gcc/expr.c:3403
#3  0x105543e4 in rs6000_emit_epilogue (sibcall=0) at ../../gcc/gcc/config/rs6000/rs6000.c:16095
#4  0x28000488 in ?? ()
#5  0x105ec5ec in gen_epilogue () at ../../gcc/gcc/config/rs6000/rs6000.md:14476
#6  0x102242b4 in rest_of_handle_thread_prologue_and_epilogue () at ../../gcc/gcc/function.c:5298
#7  0x102a7af4 in execute_one_pass (pass=0x108f6a78) at ../../gcc/gcc/passes.c:1117
#8  0x102a7d68 in execute_pass_list (pass=0x108f6a78) at ../../gcc/gcc/passes.c:1170
#9  0x102a7d80 in execute_pass_list (pass=0x108f6ee8) at ../../gcc/gcc/passes.c:1171
#10 0x102a7d80 in execute_pass_list (pass=0x108f6eb4) at ../../gcc/gcc/passes.c:1171
#11 0x103ae68c in tree_rest_of_compilation (fndecl=0xf7dbc100) at ../../gcc/gcc/tree-optimize.c:404
#12 0x105705fc in cgraph_expand_function (node=0xf7dbc300) at ../../gcc/gcc/cgraphunit.c:1060
#13 0x105728c4 in cgraph_optimize () at ../../gcc/gcc/cgraphunit.c:1123
#14 0x10017914 in c_write_global_declarations () at ../../gcc/gcc/c-decl.c:8077
#15 0x1033fff0 in toplev_main (argc=, argv=) at ../../gcc/gcc/toplev.c:1055
#16 0x1009e370 in main (argc=0, argv=0x0) at ../../gcc/gcc/main.c:35

This doesn't happen with r129290.

Ira

-- 
           Summary: Vectorizer testcases fail
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: irar at il dot ibm dot com
 GCC build triplet: powerpc64-suse-linux
  GCC host triplet: powerpc64-suse-linux
GCC target triplet: powerpc64-suse-linux

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33812
[Bug tree-optimization/33804] ICE in vect_transform_stmt, at tree-vect-transform.c:6131 with -ftree-vectorize
--- Comment #5 from irar at il dot ibm dot com 2007-10-21 08:45 ---
(In reply to comment #4)
> Created an attachment (id=14370)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14370&action=view)
> Vectorization dump file
>

Thanks!

The vectorizer fails in the transformation phase, in function
vectorizable_operation:

  if (icode == CODE_FOR_nothing)
    {
      if (vect_print_dump_info (REPORT_DETAILS))
        fprintf (vect_dump, "op not supported by target.");
      if (GET_MODE_SIZE (vec_mode) != UNITS_PER_WORD
          || LOOP_VINFO_VECT_FACTOR (loop_vinfo)
             < vect_min_worthwhile_factor (code))
        return false;
      if (vect_print_dump_info (REPORT_DETAILS))
        fprintf (vect_dump, "proceeding using word mode.");
    }

During the analysis, we also get CODE_FOR_nothing, but at that stage
LOOP_VINFO_VECT_FACTOR (loop_vinfo) > vect_min_worthwhile_factor (code),
hence we proceed using word mode.

At the end of the analysis, we change the vectorization factor (divide it
by 4) to perform pure SLP on the loop. So during the transformation phase,
when we get to the same code again, we probably get
LOOP_VINFO_VECT_FACTOR (loop_vinfo) < vect_min_worthwhile_factor (code)
and we fail.

The idea was that we should not fail to vectorize during the transformation,
since everything was checked during the analysis; therefore, a gcc_assert
was put here.

I'll have to think about how to fix this problem.

Thanks,
Ira

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33804
[Bug rtl-optimization/33846] [4.3 Regression] ICE in trunc_int_for_mode, at explow.c:55
--- Comment #4 from irar at il dot ibm dot com 2007-10-21 11:02 ---
The problem is with a vector shift that has a scalar shift argument.

For the code created by the vectorizer:

  vect_var_.49_103 = ~vect_var_.47_101;
  vect_var_.50_105 = vect_var_.49_103 >> 31;

the following RTL is created:

  (ashiftrt:V4SI (not:V4SI (reg:V4SI 100))
      (const_int 31 [0x1f]))

The failure is in explow.c:55
  gcc_assert (SCALAR_INT_MODE_P (mode));
since MODE is V4SImode.

Ira

-- 
irar at il dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|-00-00 00:00:00             |2007-10-21 11:02:08
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33846
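For readers unfamiliar with the RTL in comment #4, the following is a small scalar sketch in plain C of what the two vector statements compute lane by lane. The function names are invented for this illustration and nothing here is GCC code; it only spells out the meaning of a V4SI bitwise NOT followed by an arithmetic right shift by one scalar count.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic right shift written without relying on the
   implementation-defined behaviour of >> on negative values.  */
static int32_t
ashiftrt32 (int32_t x, int count)
{
  assert (count >= 0 && count < 32);
  return x < 0 ? ~(~x >> count) : x >> count;
}

/* Lane-by-lane meaning of
   (ashiftrt:V4SI (not:V4SI in) (const_int count)).  */
static void
not_then_ashiftrt_v4si (const int32_t in[4], int32_t out[4], int count)
{
  for (int i = 0; i < 4; i++)
    out[i] = ashiftrt32 (~in[i], count);
}
```

With a count of 31, each output lane is -1 when the input lane is non-negative and 0 otherwise. The count is a single scalar shared by all four lanes, while the value being shifted has mode V4SI.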
[Bug tree-optimization/33804] ICE in vect_transform_stmt, at tree-vect-transform.c:6131 with -ftree-vectorize
--- Comment #6 from irar at il dot ibm dot com 2007-10-21 12:52 ---
The solution can be simply not to check whether the vectorization is
worthwhile during the transformation. The decision whether to vectorize or
not should be made during the analysis anyway. The vectorization factor can
get smaller only when the loop contains only SLP-style vectorization, in
which case the VF is the unrolling factor needed to operate on full vectors.
So the profitability of vectorizing this loop doesn't change.

Index: tree-vect-transform.c
===
--- tree-vect-transform.c (revision 129404)
+++ tree-vect-transform.c (working copy)
@@ -3865,18 +3865,21 @@ vectorizable_operation (tree stmt, block
     {
       if (vect_print_dump_info (REPORT_DETAILS))
        fprintf (vect_dump, "op not supported by target.");
+      /* Check only during analysis. */
       if (GET_MODE_SIZE (vec_mode) != UNITS_PER_WORD
-          || LOOP_VINFO_VECT_FACTOR (loop_vinfo)
-             < vect_min_worthwhile_factor (code))
+          || (LOOP_VINFO_VECT_FACTOR (loop_vinfo)
+              < vect_min_worthwhile_factor (code)
+              && !vec_stmt))
        return false;
       if (vect_print_dump_info (REPORT_DETAILS))
        fprintf (vect_dump, "proceeding using word mode.");
     }

-  /* Worthwhile without SIMD support? */
+  /* Worthwhile without SIMD support? Check only during analysis. */
   if (!VECTOR_MODE_P (TYPE_MODE (vectype))
       && LOOP_VINFO_VECT_FACTOR (loop_vinfo)
-         < vect_min_worthwhile_factor (code))
+         < vect_min_worthwhile_factor (code)
+      && !vec_stmt)
     {
       if (vect_print_dump_info (REPORT_DETAILS))
        fprintf (vect_dump, "not worthwhile without SIMD support.");

Tested on the vectorizer testsuite on x86-64-linux.

Ira

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33804
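To make the analysis/transformation mismatch concrete, here is a stripped-down model in plain C. Everything in it is an assumption made for illustration: the threshold of 4, the function names, and the `transforming` flag, which stands in for `vec_stmt` being non-NULL in the patch. It is not the vectorizer's code, only a sketch of the control flow described in this bug.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for vect_min_worthwhile_factor (code).  */
enum { MIN_WORTHWHILE_FACTOR = 4 };

/* Pre-patch behaviour: the profitability test runs in both phases.  */
static bool
word_mode_ok_unpatched (int vect_factor)
{
  return vect_factor >= MIN_WORTHWHILE_FACTOR;
}

/* Patched behaviour: TRANSFORMING plays the role of vec_stmt != NULL,
   so the profitability test applies only while analyzing.  */
static bool
word_mode_ok (int vect_factor, bool transforming)
{
  if (vect_factor < MIN_WORTHWHILE_FACTOR && !transforming)
    return false;
  return true;
}

/* Analysis at the full factor, then transformation at the factor
   reduced for pure SLP.  Returns false if analysis rejects the loop.  */
static bool
vectorize_loop (int vect_factor, int slp_divisor)
{
  if (!word_mode_ok (vect_factor, false))
    return false;
  vect_factor /= slp_divisor;
  /* Models the assertion that transformation must not fail.  */
  assert (word_mode_ok (vect_factor, true));
  return true;
}
```

With a factor of 8 reduced to 2 for pure SLP, the unpatched check accepts the loop during analysis and rejects it during transformation, which is the situation that trips the assertion; the patched check no longer re-evaluates profitability in the second phase.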
[Bug tree-optimization/20122] Wrong code with gcc 4.0 tree-vectorizer
--- Additional Comments From irar at il dot ibm dot com 2005-02-24 13:41 --- I found the problem that causes this. I'll send the patch next week. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20122
[Bug tree-optimization/20122] Wrong code with gcc 4.0 tree-vectorizer
-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |irar at il dot ibm dot com
                   |dot org                     |
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2005-03-02 11:42:36         |2005-03-02 12:43:57
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20122
[Bug tree-optimization/20122] Wrong code with gcc 4.0 tree-vectorizer
--- Additional Comments From irar at il dot ibm dot com 2005-03-02 12:45 --- Fixed in http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01788.html. Waiting for review. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20122