Hello,

I have been playing with gcc's new (to me) auto vectorization
optimizations. I have a particular loop for which I have made external
provisions to ensure that the data is 16-byte aligned. I have tried
everything I can think of to give gcc the hint that it is operating on
aligned data, but still the vectorizer warns that it is operating on
unaligned data and generates the less efficient MOVLPS/MOVUPS instead
of MOVAPS.

The code is like this:

#define SSE __attribute__((aligned (16)))

typedef float matrix_t[100][1024];

matrix_t aa SSE, bb SSE, cc SSE;

void calc(float *a, float *b, float *c) {
 int i, n = 1024;

 for (i=0; i<n; ++i) {
   a[i] = b[i] / (b[i] + c[i]);
 }

}

int main(void) {
 int i, n = 100;
 for (i=0; i<n; ++i) {
   calc(aa[i], bb[i], cc[i]);
 }
}

gcc rejects the code if I specify alignment attributes on the formal
parameters (obviously it was dumb to even try that), and there does not
seem to be a way to make the alignment hint apply to the object
referenced by the pointer rather than to the pointer itself.

In my application it is important that the function doing the
computations remains abstracted away from the data definitions, as
there is over 1G of data dynamically arranged and the actual alignment
provisions are made with mmap().
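
For illustration only, here is a minimal sketch of that kind of
arrangement. It assumes an anonymous private mapping, and the helper
names (map_floats, is_aligned16, all_rows_aligned16) are hypothetical;
the real application's layout may differ. mmap() returns page-aligned
memory, and a row of 1024 floats is 4096 bytes, so every row starts on
a 16-byte boundary even though calc() only ever sees a plain float *.

```c
#define _DEFAULT_SOURCE /* assumption: glibc; exposes MAP_ANONYMOUS */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

enum { ROWS = 100, COLS = 1024 };

/* Hypothetical helper: map `count` floats in an anonymous region.
 * Returns NULL on failure. The mapping is page-aligned. */
static float *map_floats(size_t count) {
    void *p = mmap(NULL, count * sizeof(float), PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : (float *)p;
}

/* Nonzero if p sits on a 16-byte boundary (what MOVAPS requires). */
static int is_aligned16(const void *p) {
    return ((uintptr_t)p & 15u) == 0;
}

/* Nonzero if every row of a rows x cols float matrix is 16-byte aligned. */
static int all_rows_aligned16(const float *base, size_t rows, size_t cols) {
    for (size_t r = 0; r < rows; ++r) {
        if (!is_aligned16(base + r * cols)) {
            return 0;
        }
    }
    return 1;
}
```

A caller would map ROWS * COLS floats once per matrix and pass
base + i * COLS to calc() for each row i. The alignment is real at run
time, but nothing in calc()'s signature tells the compiler about it.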

Does anyone have a suggestion?

Regards,
Michael James
