Re: [C++][Compute] RFC: add SIMD support to C++ kernel

2020-03-20 Thread Antoine Pitrou
On Fri, 20 Mar 2020 10:56:51 +0800 Yibo Cai wrote:
> I'm revisiting this old thread as I see some avx512 code merged recently[1].
> Code maintenance will be non-trivial if we want to cover more
> hardware(sse/avx/avx512/neon/sve/...) and optimize more code in the future.
> #ifdef is obviously

Re: [C++][Compute] RFC: add SIMD support to C++ kernel

2020-03-19 Thread Yibo Cai
Thanks Wes for the quick response. Yes, inlining can be a problem for a runtime dispatcher. It means we should dispatch at the level of the whole loop[1], not the code inside the loop[2]. This may create some traps for developers. [1] https://github.com/apache/arrow/blob/master/cpp/src/arrow/util/bpacking.h#L3760
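To make the inlining point concrete, here is a minimal sketch of the two dispatch granularities. All names (AddFn, SumLoopFn, SumPerElementDispatch, SumWholeLoopDispatch) are hypothetical and are not taken from the Arrow codebase or from the linked bpacking.h. With per-element dispatch the loop body is an indirect call that the compiler typically cannot inline or vectorize; with whole-loop dispatch there is one indirect call, and the loop inside the selected function can be optimized as usual.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical signatures for illustration only.
using AddFn = int64_t (*)(int64_t, int64_t);
using SumLoopFn = int64_t (*)(const int64_t*, int64_t);

inline int64_t AddScalar(int64_t a, int64_t b) { return a + b; }

// Dispatch inside the loop: one indirect call per element. The call
// usually blocks inlining and vectorization of the loop body.
inline int64_t SumPerElementDispatch(const int64_t* values, int64_t length,
                                     AddFn add) {
  assert(add != nullptr);
  int64_t sum = 0;
  for (int64_t i = 0; i < length; ++i) {
    sum = add(sum, values[i]);
  }
  return sum;
}

// A complete loop that the compiler is free to inline and vectorize.
inline int64_t SumLoopScalar(const int64_t* values, int64_t length) {
  int64_t sum = 0;
  for (int64_t i = 0; i < length; ++i) {
    sum += values[i];
  }
  return sum;
}

// Dispatch around the loop: a single indirect call per invocation.
inline int64_t SumWholeLoopDispatch(const int64_t* values, int64_t length,
                                    SumLoopFn loop) {
  assert(loop != nullptr);
  return loop(values, length);
}
```

Both variants return the same result; the difference is only in how much work sits behind each indirect call.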

Re: [C++][Compute] RFC: add SIMD support to C++ kernel

2020-03-19 Thread Wes McKinney
Hi Yibo, I agree with this; having #ifdef in many places in the codebase is not maintainable longer-term. As for runtime dispatch, we could populate a function table of all machine-dependent functions once, so that the dispatch isn't happening on each function call. Or some similar strategy. This
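One possible shape for such a function table, as a rough sketch rather than anything proposed in the thread: KernelTable, MakeKernelTable, Kernels, SumScalar, and SumAvx2 are hypothetical names. The sketch assumes GCC or Clang on x86-64 for the AVX2 path (the `target` attribute and `__builtin_cpu_supports`); on other toolchains it falls back to the scalar kernel only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical kernel signature; not Arrow's actual API.
using SumFn = int64_t (*)(const int64_t* values, int64_t length);

// Portable fallback kernel.
inline int64_t SumScalar(const int64_t* values, int64_t length) {
  int64_t sum = 0;
  for (int64_t i = 0; i < length; ++i) sum += values[i];
  return sum;
}

#if (defined(__GNUC__) || defined(__clang__)) && defined(__x86_64__)
#define SKETCH_HAVE_AVX2_PATH 1
// Same loop, compiled with AVX2 enabled for this function only, so the
// auto-vectorizer may use 256-bit registers.
__attribute__((target("avx2")))
inline int64_t SumAvx2(const int64_t* values, int64_t length) {
  int64_t sum = 0;
  for (int64_t i = 0; i < length; ++i) sum += values[i];
  return sum;
}
#endif

// Table of machine-dependent functions (a single entry in this sketch).
struct KernelTable {
  SumFn sum;
};

// Probe the CPU and pick the best available implementation.
inline KernelTable MakeKernelTable() {
  KernelTable table{SumScalar};
#if defined(SKETCH_HAVE_AVX2_PATH)
  if (__builtin_cpu_supports("avx2")) table.sum = SumAvx2;
#endif
  return table;
}

// The table is built once, on first use; later calls only read it.
inline const KernelTable& Kernels() {
  static const KernelTable table = MakeKernelTable();
  return table;
}
```

A caller would write `Kernels().sum(values, length)`; the CPU feature check runs only the first time `Kernels()` is called.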

Re: [C++][Compute] RFC: add SIMD support to C++ kernel

2020-03-19 Thread Yibo Cai
I'm revisiting this old thread as I see some avx512 code merged recently[1]. Code maintenance will be non-trivial if we want to cover more hardware (sse/avx/avx512/neon/sve/...) and optimize more code in the future. #ifdef is obviously a no-go. So I'm selling my proposal again :) - put all

Re: [C++][Compute] RFC: add SIMD support to C++ kernel

2019-12-24 Thread Wes McKinney
If we go the route of AOT-compilation of Gandiva kernels as an approach to generate a shared library with many kernels, we might indeed look at possibly generating a "fat" binary with runtime dispatch between AVX2-optimized vs. SSE <= 4.2 (or non-SIMD altogether) kernels. This is something we

Re: [C++][Compute] RFC: add SIMD support to C++ kernel

2019-12-20 Thread Antoine Pitrou
Hi, I would recommend against reinventing the wheel. It would be possible to reuse an existing C++ SIMD library. There are several of them (Vc, xsimd, libsimdpp...). Of course, "just use Gandiva" is another possible answer. Regards Antoine. On 20/12/2019 at 08:32, Yibo Cai wrote: > Hi,

[C++][Compute] RFC: add SIMD support to C++ kernel

2019-12-19 Thread Yibo Cai
Hi, I'm investigating SIMD support for the C++ compute kernels (not Gandiva). A typical case is the sum kernel[1]. The tight loop below can be easily optimized with SIMD:

    for (int64_t i = 0; i < length; i++) {
      local.sum += values[i];
    }

The compiler already does loop vectorization. But it's done at
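For illustration, the sum loop quoted above could be written with explicit SIMD roughly as follows. This is only a sketch under assumptions: SumSimd and SumScalarRef are hypothetical names, the element type is taken to be int64_t, and SSE2 intrinsics are used simply because they are baseline on x86-64. When `__SSE2__` is not defined the function falls back to the scalar loop.

```cpp
#include <cstdint>
#include <cassert>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Scalar reference, equivalent to the loop quoted in the message.
inline int64_t SumScalarRef(const int64_t* values, int64_t length) {
  int64_t sum = 0;
  for (int64_t i = 0; i < length; ++i) sum += values[i];
  return sum;
}

// Explicit SIMD variant: adds two 64-bit lanes per iteration.
inline int64_t SumSimd(const int64_t* values, int64_t length) {
  assert(length >= 0);
#if defined(__SSE2__)
  __m128i acc = _mm_setzero_si128();
  int64_t i = 0;
  // Main loop: unaligned load of two int64 values, lane-wise add.
  for (; i + 2 <= length; i += 2) {
    const __m128i v =
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(values + i));
    acc = _mm_add_epi64(acc, v);
  }
  // Horizontal reduction of the two lanes.
  int64_t lanes[2];
  _mm_storeu_si128(reinterpret_cast<__m128i*>(lanes), acc);
  int64_t sum = lanes[0] + lanes[1];
  // Tail: at most one leftover element.
  for (; i < length; ++i) sum += values[i];
  return sum;
#else
  return SumScalarRef(values, length);
#endif
}
```

Wider instruction sets (AVX2, AVX-512, NEON, SVE) would follow the same pattern with different register widths, which is where the maintenance concern raised later in the thread comes from.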