Another thing worth noting: I believe Intel has put some effort into autovectorization for AVX2 in recent (next gen?) LLVM/Clang. It might be worth looking into, since as far as I understand it uses a mask that lets the CPU skip computations that would not change the result, but I think it is only available on recent Intel CPUs.

Also worth keeping in mind is that future versions of LLVM will have to deal with GCC extensions and perhaps also Clang pragmas. So maybe take a look at:

http://clang.llvm.org/docs/LanguageExtensions.html#vectors-and-extended-vectors

and

http://clang.llvm.org/docs/LanguageExtensions.html#extensions-for-loop-hint-optimizations

?
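
To make those two links a bit more concrete, here is a minimal sketch of what the extensions look like in C. It assumes a compiler that accepts the GCC-style vector_size attribute (Clang and GCC both do) and uses the "#pragma clang loop" hint from the second link. The names float4, madd4 and saxpy are made up for illustration only, and whether the loop actually gets vectorized depends on the compiler version and flags.

```c
#include <stddef.h>

/* Four 32-bit floats packed into one 16-byte vector, declared with the
   GCC-style vector_size attribute (Clang accepts it as well).
   The type name float4 is only for this sketch. */
typedef float float4 __attribute__((vector_size(16)));

/* Element-wise a * b + c, written with ordinary operators on vector types. */
static float4 madd4(float4 a, float4 b, float4 c)
{
    return a * b + c;
}

/* Plain scalar loop with a Clang loop hint that asks the vectorizer to try.
   Compilers that do not know the pragma ignore it (possibly with a warning). */
static void saxpy(size_t n, float alpha, const float *x, float *y)
{
#pragma clang loop vectorize(enable) interleave(enable)
    for (size_t i = 0; i < n; ++i)
        y[i] = alpha * x[i] + y[i];
}
```

The vector type gives explicit control over the data layout, while the pragma leaves the code scalar and only nudges the optimizer, so the two approaches can be mixed depending on how much the autovectorizer can be trusted for a given loop.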
