Richard Guenther [EMAIL PROTECTED] wrote on 15/12/2005 14:52:27:
On 12/15/05, Dorit Naishlos [EMAIL PROTECTED] wrote:
So, in short - when can we assume that pointer types have the minimum
alignment required by their underlying type?
I think the C standard always guarantees this.
As far
Given a pointer to type T - when can we assume that the data pointed to is
naturally aligned (aligned on the size of the type T)?
The vectorizer currently works under the assumption that all data is
naturally aligned. At least one place where this may result in generation
of wrong code by
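To make the natural-alignment question concrete, here is a minimal C sketch, not taken from the thread. The helper is_aligned_on and the buffer buf are hypothetical names used only for illustration; nothing here is a GCC internal. It assumes a C11 compiler and a platform where converting a pointer to uintptr_t yields its address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a GCC API): nonzero when p is a multiple of
   `size` bytes, i.e. "naturally aligned" for an object of that size. */
static int is_aligned_on(const void *p, size_t size)
{
    assert(size != 0);
    return (uintptr_t)p % size == 0;
}

/* A 16-byte buffer whose start is forced onto an 8-byte boundary. */
static _Alignas(8) unsigned char buf[16];

/* Address at which a uint64_t would be naturally aligned. */
static const void *aligned_addr(void)
{
    return buf;
}

/* One byte further on: a uint64_t accessed here (for instance through a
   packed struct member or a cast) would not be naturally aligned. */
static const void *misaligned_addr(void)
{
    return buf + 1;
}
```

A pointer of type uint64_t * can end up holding either address, which is why the pointer type alone may not be enough to assume natural alignment of the data.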
The used compile options are:
...
-ftree-vectorize
I don't know how much vectorization takes place in your code, but
vectorization would currently greedily increase code size (i.e. without
trying to estimate when it is profitable). One way to avoid some of the
code size growth during
Hello Everyone,
I am interested in knowing more about the vectorizer in GCC. Does
anyone have or know of any statistics about the percentage of loops
that can be vectorized in some benchmarks like MediaBench, SPEC2K and
so forth?
I have some old Spec2000 statistics, from around
Steven Bosscher [EMAIL PROTECTED] wrote on 11/16/2005 10:39:24 PM:
On Wednesday 16 November 2005 15:35, Dorit Naishlos wrote:
We'd like to suggest a few new tree-codes/optabs in order to express the
extraction and merging of elements from/to vectors.
Watch out for tree code starvation
Paul Brook [EMAIL PROTECTED] wrote on 11/16/2005 05:03:47 PM:
On Wednesday 16 November 2005 14:35, Dorit Naishlos wrote:
We're going to commit to autovect-branch vectorization support for
non-unit-stride accesses.
We'd like to suggest a few new tree-codes/optabs in order to express
It looks like maybe a 64bit scalar-evolution issue - when I compile on
powerpc-linux with -m64, I also get the
vect4.f:4: note: not consecutive access
message.
This problem looks very similar to PR18403 which has been resolved a while
ago:
When compiling for 32bit, we get the following
Hi Toon,
Thanks for the testcases.
This one does get vectorized with autovect-branch:
~/autovect_cvs/bin/gfortran -O3 -ftree-vectorize -maltivec
-ftree-vectorizer-verbose=4 -S hilaram1.f90
hilaram1.f90:5: note: dependence distance = 0.
hilaram1.f90:5: note: accesses have the same
On Oct 21, 2005, at 9:19 AM, Toon Moene wrote:
L.S.,
This code:
SUBROUTINE S(A, B, N)
DIMENSION A(N), B(N)
READ*,Z,B
DO I = 1, N
A(I) = Z * B(I)
ENDDO
PRINT*,A
END
when compiled thusly:
The problem here is not
Like HIRLAM 6, this is also an aliasing problem:
hilaram5.f90:4: note: not vectorized: can't determine dependence between
com.b[D.909_22] and (*a_8)[D.909_22]
hilaram5.f90:7: note: not vectorized: unhandled data-ref
hilaram5.f90:7: note: vectorized 0 loops in function.
dorit
L.S.,
This
Andrew Pinski [EMAIL PROTECTED] wrote on 20/09/2005 18:09:20:
On Sep 20, 2005, at 3:01 AM, Dorit Naishlos wrote:
We've had the testcase below in autovect-branch for a while, testing that
the 3 loops get vectorized. On mainline the third loop now gets eliminated
by DCE (.t44.dce3
We've had the testcase below in autovect-branch for a while, testing that
the 3 loops get vectorized. On mainline the third loop now gets eliminated
by DCE (.t44.dce3). Not sure I understand why... isn't the print loop
enough to keep it alive?
==
subroutine foo(a,b)
real
Planned vectorization enhancements for 4.2:
1. Recognize reduction patterns (Dorit).
Some computations have specialized target support and can be
vectorized more efficiently if the computation idiom is recognized and
vectorized as a whole. This is especially true of idioms that involve
Daniel Berlin [EMAIL PROTECTED] wrote on 12/08/2005 17:56:11:
comments/ideas?
I would start by figuring out why update_ssa + rewrite_into_loop_closed
isn't putting SFT.3 into loop closed ssa form.
Even if we do put virtual vars back into loop closed, that's still a
bug.
I found the
Hi,
PR22543 has a testcase in which we fail with:
error: definition in block 1 does not dominate use in block 3
for SSA_NAME: SFT.3_39 in statement:
# VUSE SFT.3_39;
lsm_tmp.35_36 = D.2625.j;
In this testcase block 3 is a loop exit block, and block 1 is a loop header
block. During
...
The problem seems to be that analyze_offset_expr calls the scev
analyzer explicitly asking for recomputation (third parameter is
true):
...
Why should we start the analysis from scratch in this case? The same
question could be asked for all the uses of analyze_scalar_evolution
Richard Henderson [EMAIL PROTECTED] wrote on 20/06/2005 01:13:11:
On Sun, Jun 19, 2005 at 11:46:52PM +0300, Dorit Naishlos wrote:
The thought was to supply an API that would let the vectorizer ask for the
minimal capability it needs - if all we need is a vector shift of a
constant value
why??
The problem is that in 'expand_vector_operations_1()' in
tree-vect-generic.c we call 'optab_for_tree_code()' to get an optab for
VEC_RSHIFT_EXPR; 'optab_for_tree_code' does not have a case for
VEC_RSHIFT_EXPR, so the vector-lowering function concludes that this
tree-code is not
Richard Henderson [EMAIL PROTECTED] wrote on 19/06/2005 19:49:46:
On Sun, Jun 19, 2005 at 07:36:15PM +0300, Dorit Naishlos wrote:
... because at least for the vector-shift case I need to
check that the shift operand is constant, and only then return
optab_shri/shli.
This isn't true
Richard Henderson [EMAIL PROTECTED] wrote on 19/06/2005 20:33:02:
On Sun, Jun 19, 2005 at 08:00:22PM +0300, Dorit Naishlos wrote:
Altivec does support non immediate shift amount (even if less efficiently -
I have to put the shift amount in a vector register first). But since we
have
Devang Patel [EMAIL PROTECTED] wrote on 14/06/2005 00:24:27:
On Jun 10, 2005, at 2:01 PM, Dorit Naishlos wrote:
Devang, is vect-dv-2.c a duplicate of vect-ifcvt-1.c or are they both there
on purpose?
It is duplicate. I'll remove vect-dv-2.c tomorrow, unless I hear otherwise
Giovanni Bajo [EMAIL PROTECTED] wrote on 09/06/2005 20:37:43:
Janis Johnson [EMAIL PROTECTED] wrote:
It sounds as if there should be a check in target-supports.exp for
SSE2 support that determines whether the default test action is 'run'
or 'compile' for i686 targets.
I am not able
compilation failed to produce executable
Andreas
--
Andreas Jaeger, [EMAIL PROTECTED], http://www.suse.de/~aj
SUSE Linux Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126
Andreas Jaeger [EMAIL PROTECTED] wrote on 22/05/2005 17:29:24:
On Sun, May 22, 2005 at 05:25:13PM +0300, Dorit Naishlos wrote:
I also see these failures on powerpc-apple-darwin, but they are all solved
when I apply Keith's patch:
http://gcc.gnu.org/ml/gcc-patches/2005-05/msg00803.html
GCC 4.1 is going rather well thus far.
Technically, Stage 1 ended on April 25th, though I failed to announce
that. There are a few stage 1 tasks that have not made it in yet,
according to the Wiki:
# Autovectorization Enhancements
Items 1.4, 2.1, 2.3 (1.3)
Items 1.4 and 2.3 are in,
Hi!
On mainline we now use loop versioning and peeling for alignment
for the following loop (-march=pentium4):
We don't yet use loop-versioning in the vectorizer in mainline (we do in
autovect). We do apply peeling.
void foo3(float * __restrict__ a, float * __restrict__ b,
float
On Mon, 21 Mar 2005 13:45:19 +0100 (CET), Richard Guenther
[EMAIL PROTECTED] wrote:
...
Uh, and with -funroll-loops we seem to be lost completely, as we
produce peeling/loops for an eight times four rolling loop! Where is
the information about the loop counter gone??
the thing is you
Steve Ellcey wrote:
Most of the gcc.dg/vect/* tests contain something like:
typedef float afloat __attribute__ ((__aligned__(16)));
afloat a[N];
It looks like what is really intended here is to apply the alignment to
the array type. The point is that the entire array has to
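As a rough illustration of the distinction raised here, the C sketch below attaches the alignment to the array object rather than to the element type. It is not the testsuite's actual fix. The array name a and the size N follow the quoted snippet, the two helper functions are hypothetical, and the code assumes GCC (or a compiler accepting the same __attribute__ syntax).

```c
#include <stddef.h>
#include <stdint.h>

#define N 16

/* Alignment requested for the array object itself: a[0] starts on a
   16-byte boundary, while each element stays an ordinary float. */
static float a[N] __attribute__ ((__aligned__ (16)));

/* Hypothetical check: nonzero when the array starts on a 16-byte boundary. */
static int array_start_is_16_aligned(void)
{
    return (uintptr_t)a % 16 == 0;
}

/* Hypothetical check: distance in bytes between consecutive elements. */
static size_t element_stride(void)
{
    return (size_t)((char *)&a[1] - (char *)&a[0]);
}
```

With this form only the start of the array is over-aligned; consecutive elements remain sizeof(float) bytes apart.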
The remaining 1.1 projects include:
* Autovectorization Enhancements (some parts)
1.2 Incrementally preserve loop-closed form when vectorizing
Submitted today:
http://gcc.gnu.org/ml/gcc-patches/2005-03/msg01318.html
1.3 Improvements to peeling for alignment
Submitted today: