dmatth1 commented on PR #50030:
URL: https://github.com/apache/arrow/pull/50030#issuecomment-4574408369

   > If we add an xsimd implementation, I wonder if it is worth using it for Neon/SSE.
   > 
   > * On the one hand the current autovec works and is minimal to maintain/test.
   > * On the other hand autovec is a black box.
   > 
   > Though with a bit more work, the xsimd implementation could be generic and also support AVX512, SVE, and future targets too.
   > 
   > I have no intuition how xsimd compares to autovec. Given the compiler also optimizes xsimd's code, I'd say slightly better, but again it's possible (and it has been the case) some things are not properly expressed in xsimd as well.
   
   I ran a microbenchmark on my M1 MacBook, probe-only (no hash) and in-cache:
   * `clang`: autovec and xsimd perform about the same
   * `gcc 15`: xsimd was 3x faster than autovec
   
   So I think there's a real argument to be made for using xsimd on Neon. We could use the dispatch array. That would increase the scope of this change a bit, so I lean towards addressing it in a follow-up (I can create an issue), but whatever you think is best.
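
For readers who want to try a comparison of this shape, here is a minimal sketch of a probe-only, in-cache loop written so the compiler is free to autovectorize it. It is not the benchmark measured above and not code from the PR: the 8-lane `Block` layout, the salt constants, and the names `MakeMask`, `Insert`, `Probe`, `BlockIndex`, `CountHits`, and `NanosPerProbe` are all assumptions made for illustration. Hashes are computed ahead of time so that only the probe is timed.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 8-lane block, loosely modelled on a split-block Bloom filter.
// The layout is an assumption for illustration, not the structure in the PR.
constexpr int kLanes = 8;
struct Block {
  uint32_t words[kLanes];
};

// Arbitrary odd multipliers, one per lane (illustrative values).
constexpr uint32_t kSalt[kLanes] = {0x47b6137bU, 0x44974d91U, 0x8824ad5bU,
                                    0xa2b7289dU, 0x705495c7U, 0x2df1424bU,
                                    0x9efc4947U, 0x5c6bfb31U};

// Derive one bit per 32-bit lane from the key. Fixed trip count, no branches.
inline void MakeMask(uint32_t key, uint32_t mask[kLanes]) {
  for (int i = 0; i < kLanes; ++i) {
    mask[i] = uint32_t{1} << ((key * kSalt[i]) >> 27);
  }
}

inline void Insert(Block& block, uint32_t key) {
  uint32_t mask[kLanes];
  MakeMask(key, mask);
  for (int i = 0; i < kLanes; ++i) block.words[i] |= mask[i];
}

// Branch-free probe: collect any mask bit that is absent from the block.
inline bool Probe(const Block& block, uint32_t key) {
  uint32_t mask[kLanes];
  MakeMask(key, mask);
  uint32_t missing = 0;
  for (int i = 0; i < kLanes; ++i) missing |= mask[i] & ~block.words[i];
  return missing == 0;
}

// Upper 32 bits of the precomputed hash pick the block, lower 32 bits the mask.
inline std::size_t BlockIndex(uint64_t hash, std::size_t num_blocks) {
  return static_cast<std::size_t>(hash >> 32) % num_blocks;
}

// Probe-only pass over precomputed hashes (no hashing inside the loop).
std::size_t CountHits(const std::vector<Block>& blocks,
                      const std::vector<uint64_t>& hashes) {
  assert(!blocks.empty());
  std::size_t hits = 0;
  for (uint64_t h : hashes) {
    hits += Probe(blocks[BlockIndex(h, blocks.size())],
                  static_cast<uint32_t>(h));
  }
  return hits;
}

// Time one pass; keep `blocks` small enough to stay in cache when calling this.
double NanosPerProbe(const std::vector<Block>& blocks,
                     const std::vector<uint64_t>& hashes,
                     std::size_t* hits_out) {
  assert(!hashes.empty());
  const auto start = std::chrono::steady_clock::now();
  *hits_out = CountHits(blocks, hashes);
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::nano>(stop - start).count() /
         static_cast<double>(hashes.size());
}
```

Building the same file with `clang++ -O3` and `g++ -O3` and comparing `NanosPerProbe` is one way to see how much each compiler's autovectorizer gets out of the scalar loops. An xsimd variant would replace the per-lane loops in `MakeMask` and `Probe` with explicit batch operations; it is left out here to keep the sketch dependency-free.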


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
