------- Comment #2 from dorit at il dot ibm dot com 2005-12-15 12:41 -------

The problem is that the vectorizer applies loop-peeling in order to align the data reference *(m->c+i), and peeling only works correctly if the data is naturally aligned (aligned on its type size). The vectorizer currently assumes this blindly, but on the Pentium4 doubles are not necessarily 64-bit aligned.
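To make the natural-alignment requirement concrete, here is a minimal sketch of the arithmetic behind peeling for alignment. It is not the vectorizer's code: the helper name peel_count and its parameters are hypothetical, and the 16-byte vector size in the usage note is an assumption chosen for illustration. Each peeled scalar iteration advances the address by one element, so the prolog can only reach a vector boundary when the misalignment is a multiple of the element size.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only (not a GCC function).
   Returns the number of scalar iterations to peel so that the access at
   addr + peel * elem_size is aligned to vec_size bytes, or -1 if no
   peel count can achieve that. */
static int peel_count(uintptr_t addr, unsigned elem_size, unsigned vec_size)
{
    /* Distance of addr past the previous vector boundary, in bytes. */
    unsigned misalign = (unsigned)(addr % vec_size);

    /* Peeling moves the address in steps of elem_size. If the misalignment
       is not a multiple of elem_size, the data is not naturally aligned
       and no number of steps lands on a vector boundary. */
    if (misalign % elem_size != 0)
        return -1;

    /* Bytes remaining to the next boundary, converted to elements. */
    return (int)(((vec_size - misalign) % vec_size) / elem_size);
}
```

With 8-byte doubles and 16-byte vectors, peel_count(0x1008, 8, 16) is 1, while peel_count(0x1004, 8, 16) is -1: a double that is only 4-byte aligned stays misaligned no matter how many iterations are peeled.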
Coincidentally, Devang and I discussed this issue last week, and Devang actually committed a patch to the apple-ppc branch that works around the problem (http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=108214). Devang's patch, however, will not fix this PR: it disables vectorization only when the vectorizer was able to compute the misalignment and found that it is not a multiple of the type size. In this testcase the misalignment is unknown at compile time.

To fix this problem we need to disable loop-peeling in the vectorizer if we can't prove that the data is naturally aligned. Alternatively, if we can't prove it either way, we can peel the loop but control the number of iterations it executes with a runtime test (i.e. have the prolog loop iterate over the entire loop count if at runtime we discover that the data is not naturally aligned).

--
dorit at il dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dorit at il dot ibm dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413