cyb70289 commented on PR #49756: URL: https://github.com/apache/arrow/pull/49756#issuecomment-4326326915
**Just for reference**

I did a quick poke with an **AI coding agent**. It analyzed why the Neon code is not inlined and proposed a fix to xsimd: [neon-bitcast-inline.patch](https://github.com/user-attachments/files/27121829/neon-bitcast-inline.patch)

Unit tests passed. The Neon code is slightly faster than SVE128, which matches expectations. I only tested one case.

```
# neon
BM_UnpackBool/NeonUnaligned/1/32      6.56 ns   6.56 ns   107048724 items_per_second=4.87937G/s
# sve128
BM_UnpackBool/Sve128Unaligned/1/32    7.06 ns   7.06 ns    99251620 items_per_second=4.53545G/s
```

I suspect the xsimd Neon bitcast code may be too complicated for the compiler to inline (possibly related to my old PR that fixed an [issue](https://github.com/xtensor-stack/xsimd/issues/573), but I have forgotten the details).

Debug report from the coding agent (I haven't read it carefully): [findings.md](https://github.com/user-attachments/files/27122154/findings.md)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
