Hello,
> I found a very embarrassing problem that almost all array types do not
> support insertion or append operations
The array abstractions are designed to be shared, potentially across FFI
boundaries, and therefore do not support mutation directly. That being
said there are a couple of options here:
- Mutable builders to construct arrays in place [1]
- Apply an operation to an array's contents [2]
- Collect into array from an iterator [3]
- Zero-copy conversion from Vec to Buffer [4]
- Fallible zero-copy conversion to builders from arrays [5]
There is also ongoing work to improve the ability to construct arrays
from their constituent parts [6].
Much of this is documented here [7] but we would always welcome
improvements to make this more clear or easier to find.
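To make the builder options more concrete, here is a minimal sketch of the idea behind [1] and [5], written against the Rust standard library only. It does not use the arrow crates: `SharedArray`, `Builder` and `demo` are hypothetical names invented for illustration, and `Arc<Vec<i32>>` merely stands in for a reference-counted buffer. It shows why a shared array offers no in-place append, and why converting an array back into a builder without copying can only succeed when nothing else holds a reference to its data.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an immutable array whose buffer can be shared.
#[derive(Debug, Clone)]
struct SharedArray {
    values: Arc<Vec<i32>>,
}

/// Hypothetical stand-in for a mutable builder that owns its buffer.
#[derive(Debug, Default)]
struct Builder {
    values: Vec<i32>,
}

impl Builder {
    /// Appending is only possible while the data is uniquely owned.
    fn append(&mut self, value: i32) {
        self.values.push(value);
    }

    /// Freeze the builder into a shareable array; the elements are moved, not copied.
    fn finish(self) -> SharedArray {
        SharedArray { values: Arc::new(self.values) }
    }
}

impl SharedArray {
    /// Fallible zero-copy conversion: succeeds only if this is the last
    /// reference to the buffer, otherwise the array is handed back unchanged.
    fn into_builder(self) -> Result<Builder, SharedArray> {
        match Arc::try_unwrap(self.values) {
            Ok(values) => Ok(Builder { values }),
            Err(values) => Err(SharedArray { values }),
        }
    }
}

/// Build an array, reopen it while unshared, then try again once it is shared.
fn demo() -> (Vec<i32>, bool) {
    let mut builder = Builder::default();
    builder.append(1);
    builder.append(2);
    let array = builder.finish();

    // No other reference exists yet, so the buffer can be reclaimed.
    let mut builder = array.into_builder().expect("array is not shared");
    builder.append(3);
    let array = builder.finish();

    // A second handle now shares the buffer, so the conversion must fail.
    let other = array.clone();
    let shared_conversion_failed = array.into_builder().is_err();
    (other.values.to_vec(), shared_conversion_failed)
}
```

Calling `demo()` yields the values 1, 2, 3 together with `true` for the failed conversion of the shared array. The builders in [1] and the `into_builder` method in [5] are the real counterparts, and their exact signatures are documented at those links.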
> is not suitable for computation operations
The arrow crate provides a large number of computation operations that
build on, and benefit from, the arrow data layout. Both arrow-rs [8] and
DataFusion [9] should provide a variety of examples to crib from.
Perhaps if you shared your precise computation I might be able to advise
on how to achieve this?
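As a rough illustration of computing on a columnar layout without inserting or appending, the sketch below again uses only the Rust standard library rather than the arrow crates. `Column`, `from_options`, `unary` and `sum` are hypothetical names chosen for this example, and a `Vec<bool>` stands in for a validity bitmap. Each operation reads the existing buffers and produces a new result, in the spirit of applying an operation to an array's contents as in [2].

```rust
/// Hypothetical stand-in for a nullable primitive array: a contiguous
/// values buffer plus one validity flag per slot.
#[derive(Debug, Clone, PartialEq)]
struct Column {
    values: Vec<i64>,
    validity: Vec<bool>,
}

impl Column {
    /// Build a column from optional values; null slots store a placeholder 0.
    fn from_options(items: &[Option<i64>]) -> Self {
        Column {
            values: items.iter().map(|v| v.unwrap_or(0)).collect(),
            validity: items.iter().map(|v| v.is_some()).collect(),
        }
    }

    /// Apply `op` to every slot and return a new column. The validity flags
    /// are reused unchanged, so null slots remain null in the result.
    fn unary(&self, op: impl Fn(i64) -> i64) -> Column {
        Column {
            values: self.values.iter().map(|&v| op(v)).collect(),
            validity: self.validity.clone(),
        }
    }

    /// Sum the valid slots, or return `None` if every slot is null.
    fn sum(&self) -> Option<i64> {
        if !self.validity.contains(&true) {
            return None;
        }
        Some(
            self.values
                .iter()
                .zip(&self.validity)
                .filter(|(_, ok)| **ok)
                .map(|(v, _)| *v)
                .sum(),
        )
    }
}
```

For example, `Column::from_options(&[Some(1), None, Some(3)]).unary(|v| v * 2).sum()` evaluates to `Some(8)`. The input column is never modified; a new column is produced at each step, which is the pattern the arrow compute operations follow.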
Please let me know if you have any questions.
Kind Regards,
Raphael Taylor-Davies
[1]: https://docs.rs/arrow-array/latest/arrow_array/builder/index.html
[2]: https://docs.rs/arrow-array/latest/arrow_array/array/struct.PrimitiveArray.html#method.unary
[3]: https://docs.rs/arrow-array/latest/arrow_array/array/struct.PrimitiveArray.html#impl-FromIterator%3CPtr%3E-for-PrimitiveArray%3CT%3E
[4]: https://docs.rs/arrow-buffer/latest/arrow_buffer/buffer/struct.Buffer.html#method.from_vec
[5]: https://docs.rs/arrow-array/latest/arrow_array/array/struct.PrimitiveArray.html#method.into_builder
[6]: https://github.com/apache/arrow-rs/issues/3879
[7]: https://docs.rs/arrow-array/latest/arrow_array/#building-an-array
[8]: https://github.com/apache/arrow-rs
[9]: https://github.com/apache/arrow-datafusion
On 13/04/2023 11:23, 叶思捷 wrote:
> hello
> Due to the high efficiency of arrow column storage, I was using
> arrow-RS to write some algorithms, but I found a very embarrassing
> problem that almost all array types do not support insertion or append
> operations. This brings me great trouble when processing the data. I
> can now create arrow-array only after I have finished the data
> operation using vec. In terms of the process, this undoubtedly
> increases the conversion cost of the two data types, and arrow-array
> is not used in the calculation, which is somewhat inconsistent with my
> original intention of using arrow.
> I wonder if this will not change, and if arrow-array itself is not
> suitable for computation operations, but is used primarily as a
> serialization tool.
> It would be appreciated if you could provide some use cases of others
> in this regard for reference.
> Best Regards,
> Shark Yie