I definitely think that adding runtime SIMD dispatching for arithmetic
functions (and using xsimd to write generic kernels that can be
cross-compiled for different SIMD targets, e.g. AVX2, AVX512, NEON) is a
good idea, and hopefully it will be pretty low-hanging fruit (~a day or
two of work) to
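For context, a minimal runtime-dispatch sketch (this is a generic illustration, not Arrow's actual mechanism; `add_avx2` is a hypothetical variant that a real build would compile in a separate translation unit with -mavx2):

```cpp
#include <cassert>
#include <cstddef>

// Portable scalar fallback, always available.
static void add_scalar(const float* a, const float* b, float* out,
                       std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

using AddFn = void (*)(const float*, const float*, float*, std::size_t);

// Pick the best kernel for the CPU we are actually running on.
static AddFn resolve_add() {
#if defined(__GNUC__) && defined(__x86_64__)
  if (__builtin_cpu_supports("avx2")) {
    // return add_avx2;  // hypothetical AVX2 variant would be selected here
  }
#endif
  return add_scalar;
}
```

The appeal of xsimd here is that the same kernel template can be instantiated once per architecture, so only this thin resolver is target-specific.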
To add on to Weston’s response, the only SIMD that will currently be generated
for the kernels by compilers is SSE4.2; they will not generate AVX2, as we have
not set up the compiler flags to do that. Also, the way the code
is written doesn’t seem super easy to vectorize for the
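To illustrate that last point with a generic sketch (not actual Arrow kernel code): a per-element branch on a validity byte tends to defeat auto-vectorization, while the branch-free form of the same reduction vectorizes readily even at SSE levels:

```cpp
#include <cstddef>
#include <cstdint>

// Branchy: the if() inside the loop usually blocks auto-vectorization.
float sum_valid_branchy(const float* v, const uint8_t* valid, std::size_t n) {
  float s = 0.0f;
  for (std::size_t i = 0; i < n; ++i)
    if (valid[i]) s += v[i];
  return s;
}

// Branch-free: multiplying by the 0/1 mask keeps the loop straight-line,
// so the compiler can emit SIMD adds.
float sum_valid_branchless(const float* v, const uint8_t* valid,
                           std::size_t n) {
  float s = 0.0f;
  for (std::size_t i = 0; i < n; ++i)
    s += v[i] * static_cast<float>(valid[i]);
  return s;
}
```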
I see, thanks. I'll do more tests and dive into more arrow compute code.
> On Jun 9, 2022, at 5:30 PM, Weston Pace wrote:
>
>> Hi, do you guys know which functions support vectorized SIMD in arrow
>> compute?
>
> I don't know that anyone has done a fully
> Hi, do you guys know which functions support vectorized SIMD in arrow compute?
I don't know that anyone has done a fully systematic analysis of which
kernels support and do not support SIMD at the moment. The kernels
are still in flux. There is an active effort to reduce overhead[1]
which is
Hi, do you guys know which functions support vectorized SIMD in arrow
compute? After a quick look at the arrow compute cpp code, I only found very
few functions that support vectorized SIMD:
● bloom filter: avx2
● key compare: avx2
● key hash: avx2
● key map: avx2
Do scalar operations support
I see, the key for the multiple loops is to ensure the data can be held in the
L2 cache, so that later
calculation can process this batch without reading from main memory, and we
can record the exec stats for every batch and do better local task scheduling
based on those stats. Thanks a lot.
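A sketch of that batching idea (generic illustration; the 256 KiB L2 size and the two-pass expression are assumptions, not Arrow's actual executor):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Process input in batches small enough that the working set stays in L2,
// so the second pass over a batch hits cache instead of main memory.
std::vector<double> scale_then_offset(const std::vector<double>& in,
                                      double scale, double offset) {
  constexpr std::size_t kL2Bytes = 256 * 1024;  // assumed L2 size
  constexpr std::size_t kBatch = kL2Bytes / (2 * sizeof(double));
  std::vector<double> out(in.size());
  for (std::size_t start = 0; start < in.size(); start += kBatch) {
    std::size_t end = std::min(in.size(), start + kBatch);
    // Pass 1 over the batch.
    for (std::size_t i = start; i < end; ++i) out[i] = in[i] * scale;
    // Pass 2: out[start..end) is still resident in cache.
    for (std::size_t i = start; i < end; ++i) out[i] += offset;
  }
  return out;
}
```

Recording per-batch timing around the inner loops would give exactly the kind of exec stats a scheduler could use.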
There are a few levels of loops. Two at the moment and three in the
future. Some are fused and some are not. What we have right now is
early stages, is not ideal, and there are people investigating and
working on improvements. I can speak a little bit about where we want
to go. An example may
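To make the fusion idea concrete (a generic sketch, not Arrow's executor): unfused kernels materialize a temporary array between passes, while a fused loop computes the whole expression per element in a single trip through memory:

```cpp
#include <cstddef>
#include <vector>

// Unfused: (a + b) * c as two kernels with a materialized temporary.
std::vector<double> eval_unfused(const std::vector<double>& a,
                                 const std::vector<double>& b,
                                 const std::vector<double>& c) {
  std::vector<double> tmp(a.size()), out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) tmp[i] = a[i] + b[i];   // kernel 1
  for (std::size_t i = 0; i < a.size(); ++i) out[i] = tmp[i] * c[i]; // kernel 2
  return out;
}

// Fused: one loop, no temporary, one pass over the data.
std::vector<double> eval_fused(const std::vector<double>& a,
                               const std::vector<double>& b,
                               const std::vector<double>& c) {
  std::vector<double> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) out[i] = (a[i] + b[i]) * c[i];
  return out;
}
```

The trade-off is the usual one: the unfused form is simpler to compose from independent kernels, while the fused form avoids the temporary's memory traffic.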
Hi Ion, thank you for your reply, which recaps the history of arrow compute.
Those links are very valuable for me to understand arrow compute internals. I
took a quick look at those documents and will take a deeper look later. I
have another question: does arrow compute support loop fusion,
Hi Shawn,
In March of 2021, when major work on the C++ query execution machinery
in Arrow was beginning, Wes sent a message [1] to the dev list and
linked to a doc [2] with some details about the planned design. A few
months later Neal sent an update [3] about this work. However those
documents
Hi, I'm considering using arrow compute as an execution kernel for our
distributed dataframe framework. I already read the great doc:
https://arrow.apache.org/docs/cpp/compute.html, but it is a usage doc. Is
there any design doc, internals introduction, or benchmarks for arrow compute
so I can