dmatth1 commented on PR #50030: URL: https://github.com/apache/arrow/pull/50030#issuecomment-4574408369
> If we add an xsimd implementation, I wonder if it is worth using it for Neon/SSE.
>
> * On the one hand the current autovec works and is minimal to maintain/test.
> * On the other hand autovec is a black box.
>
> Though with a bit more work, the xsimd implementation could be generic and also support AVX512, SVE, and future targets too.
>
> I have no intuition how xsimd compares to autovec. Given the compiler also optimizes xsimd's code, I'd say slightly better, but again it's possible (and it has been the case) some things are not properly expressed in xsimd as well.

I measured a microbenchmark on my M1 MacBook, probe-only (no hash) and in-cache:

* `clang`: autovec and xsimd perform about the same
* `gcc 15`: xsimd was 3x faster than autovec

So I think there's a real argument to be made for using xsimd on Neon. We could use the dispatch array. That would increase the scope of this change a bit, so I lean towards addressing it in a follow-up (I can create an issue), but whatever you guys think is best.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
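
For readers unfamiliar with the "dispatch array" idea mentioned in the comment, here is a minimal sketch of the general pattern. It is not Arrow's actual dispatch code: `SimdLevel`, `ProbeFn`, `ProbeScalar`, `ProbeBlocked`, `kProbeDispatch`, and `SelectProbe` are all hypothetical names, and `ProbeBlocked` is only a stand-in for an explicit SIMD kernel (a real one would presumably be written with xsimd batches) so that the example builds without xsimd.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical SIMD levels, ordered from least to most specialized.
enum class SimdLevel : std::size_t { kScalar = 0, kNeon = 1 };
constexpr std::size_t kNumLevels = 2;

// Signature shared by every probe kernel: count the bytes in ctrl[0..n)
// that are equal to `tag`.
using ProbeFn = std::size_t (*)(const std::uint8_t* ctrl, std::size_t n,
                                std::uint8_t tag);

// Plain loop. Whether it gets vectorized is left to the compiler (autovec).
std::size_t ProbeScalar(const std::uint8_t* ctrl, std::size_t n,
                        std::uint8_t tag) {
  std::size_t matches = 0;
  for (std::size_t i = 0; i < n; ++i) matches += (ctrl[i] == tag);
  return matches;
}

// Stand-in for an explicit SIMD kernel (assumption: a real version would
// use a SIMD library). It walks 16-byte blocks, then a scalar tail.
std::size_t ProbeBlocked(const std::uint8_t* ctrl, std::size_t n,
                         std::uint8_t tag) {
  std::size_t matches = 0;
  std::size_t i = 0;
  for (; i + 16 <= n; i += 16) {
    for (std::size_t j = 0; j < 16; ++j) matches += (ctrl[i + j] == tag);
  }
  for (; i < n; ++i) matches += (ctrl[i] == tag);
  return matches;
}

// Dispatch array: one slot per SimdLevel. A nullptr slot would mean
// "no kernel built for this level".
constexpr std::array<ProbeFn, kNumLevels> kProbeDispatch = {&ProbeScalar,
                                                            &ProbeBlocked};

// Pick the most specialized kernel at or below the best supported level,
// falling back to the scalar kernel if every slot is empty.
ProbeFn SelectProbe(SimdLevel best_supported) {
  for (std::size_t level = static_cast<std::size_t>(best_supported) + 1;
       level-- > 0;) {
    if (kProbeDispatch[level] != nullptr) return kProbeDispatch[level];
  }
  return &ProbeScalar;
}
```

With this layout, moving Neon from autovec to an explicit kernel would amount to filling one more slot in the table, which is roughly why it could be handled in a follow-up without touching the callers.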
