Thanks for raising the issue, Paddy. In C++/Python/R we often work
with very large contiguous datasets, so having support for 64-bit
lengths is important. If supporting this in Rust is not a hardship, I
think it's a good idea.

For IPC (shared memory) or RPC (Flight / gRPC), in many cases it would
make sense to break the data into smaller chunks. We have an interface
to slice a table (which may be either contiguous or chunked
internally) into chunks of a desired size (like 64K or similar):

https://github.com/apache/arrow/blob/master/cpp/src/arrow/table.h#L266
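
Roughly, using it looks something like the following (a minimal sketch:
TableBatchReader is one way to do this kind of rebatching, not necessarily
the exact interface at that line, and the SendInChunks name and the 64K
chunk size are just placeholders):

  #include <cstdint>
  #include <memory>

  #include <arrow/record_batch.h>
  #include <arrow/status.h>
  #include <arrow/table.h>

  // Rebatch a (possibly chunked) Table into RecordBatches of at most
  // `max_chunksize` rows before handing them to IPC / Flight.
  // SendInChunks is a placeholder name, not an Arrow API.
  arrow::Status SendInChunks(const arrow::Table& table, int64_t max_chunksize) {
    arrow::TableBatchReader reader(table);
    reader.set_chunksize(max_chunksize);  // e.g. 64 * 1024 rows

    std::shared_ptr<arrow::RecordBatch> batch;
    while (true) {
      ARROW_RETURN_NOT_OK(reader.ReadNext(&batch));
      if (batch == nullptr) break;  // end of table
      // ... write `batch` to an ipc::RecordBatchWriter or Flight stream ...
    }
    return arrow::Status::OK();
  }

Each RecordBatch that comes out is then small enough to stay under
whatever length limit the receiving implementation has.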

- Wes
On Thu, Dec 6, 2018 at 8:20 PM paddy horan <paddyho...@hotmail.com> wrote:
>
> All,
>
> As part of the PR for ARROW-3347 there was a discussion regarding the type 
> that should be used for anything that measures the length of an array, i.e.  
> len and capacity.
>
> The result of this discussion was that the Rust implementation should switch 
> to using usize as the type for representing len and capacity.  This would 
> mean supporting a way to split larger arrays into smaller arrays when passing 
> data from one implementation to another.  The exact size of these smaller 
> arrays would depend on the implementation you are passing data to.  C++ 
> supports arrays up to size i64, but **all** implementations support lengths 
> up to i32 as specified by the spec.  The full discussion is here:
> https://github.com/apache/arrow/pull/2858
>
> This is not a major change so I’ll push it to 0.13 but I wanted to open up 
> the discussion before making the change, since the previous debate was hidden in a 
> PR.  In particular, Andy and Chao, are you in favor of this change?
>
> Paddy
