On Feb 26, 2013, at 02:21, Claudio Freire wrote:

> I wouldn't assume. Even if they are in effect aligned, if the compiler
> doesn't know it (ie, if malloc doesn't mark them as such),
> vectorization will still assume out-of-alignment access.

I may be wrong, but if that were the case (and glue code were added to ensure 
proper alignment), auto-vectorisation should not be able to provoke a crash on 
win32 because of ... incorrect alignment. And yet such crashes do happen.

> Architecture-mandated and SSE/2/3/MMX/Whatever alignment requirements
> tend to be different.

Of course, but as far as I understand, not in this case, because Apple 
makes such intensive use of SIMD throughout its APIs/SDKs.

> You can write a very simple test case to check it out.

Done. More precisely, I was comparing a hand-coded SIMD version against a 
straightforward scalar version of some functions I'd found when I discovered 
that gcc-4.7 has auto-vectorisation on by default (at least on OS X): the 
scalar version was almost 2.5x faster than the SIMD version. That's what set 
the whole thing rolling, raising the question of whether there wouldn't be 
gains (albeit undoubtedly smaller) to be had from letting the compiler do its 
thing on the ffmpeg sources.
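
For reference, a test case of the kind suggested above could look roughly like 
the sketch below. It is only an illustration under stated assumptions: the 
helper names is_aligned and count_misaligned are made up here, 16 bytes is 
assumed to be the alignment the SSE loads care about, and the allocation sizes 
are arbitrary. It only reports what malloc returns at run time. It says 
nothing about what the compiler knows at compile time, which is the other half 
of the argument.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if p is aligned to `alignment` bytes, else 0.
 * `alignment` must be a non-zero power of two. */
static int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Allocates `count` blocks of varying, deliberately odd sizes and returns
 * how many of the successful allocations are NOT 16-byte aligned.
 * The value 16 is an assumption (SSE alignment), not something malloc
 * is required to guarantee on every platform. */
static int count_misaligned(int count)
{
    int misaligned = 0;
    for (int i = 1; i <= count; i++) {
        void *p = malloc((size_t)i * 7);
        if (p != NULL && !is_aligned(p, 16))
            misaligned++;
        free(p); /* free(NULL) is a no-op, so this is safe either way */
    }
    return misaligned;
}
```

Calling count_misaligned(1000) from main and printing the result gives a quick 
indication for a given platform: a result of 0 suggests, but does not prove, 
that malloc hands out 16-byte aligned blocks there.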

R
_______________________________________________
Libav-user mailing list
Libav-user@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/libav-user