------- Comment #14 from martin at mpa-garching dot mpg dot de 2010-06-09 12:06 -------
SSE performance is fine again, thanks a lot!
One more question, if that's OK: depending on ARRSZ, the testcase uses wildly varying amounts of CPU time. On my machine it takes about half a second for ARRSZ=1024 but almost 10 seconds for ARRSZ=20, which is extremely strange because the operation count is the same in both cases. I suspect that something weird is happening with respect to the cache and prefetching. Should I open another PR for this?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44423