Thanks for raising the issue, Paddy. In C++/Python/R we often work with very large contiguous datasets, so having support for 64-bit lengths is important. If supporting this in Rust is not a hardship, I think it's a good idea.
For IPC (shared memory) or RPC (Flight / gRPC), in many cases it would make sense to break things into smaller chunks. We have an interface to slice a table (which may be either contiguous or chunked internally) into chunks of a desired size (like 64K or similar):

https://github.com/apache/arrow/blob/master/cpp/src/arrow/table.h#L266

(A rough sketch of what this chunking could look like on the Rust side is appended at the bottom of this message.)

- Wes

On Thu, Dec 6, 2018 at 8:20 PM paddy horan <paddyho...@hotmail.com> wrote:
>
> All,
>
> As part of the PR for ARROW-3347 there was a discussion regarding the type
> that should be used for anything that measures the length of an array, i.e.
> len and capacity.
>
> The result of this discussion was that the Rust implementation should switch
> to using usize as the type for representing len and capacity. This would
> mean supporting a way to split larger arrays into smaller arrays when passing
> data from one implementation to another. The exact size of these smaller
> arrays would depend on the implementation you are passing data to. C++
> supports arrays with lengths up to i64, but **all** implementations support
> lengths up to i32 as specified by the spec. The full discussion is here:
> https://github.com/apache/arrow/pull/2858
>
> This is not a major change, so I'll push it to 0.13, but I wanted to open up
> the discussion before making the change; the previous debate was hidden in a
> PR. In particular, Andy and Chao, are you in favor of this change?
>
> Paddy
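
For illustration, a rough sketch of the chunking idea in plain Rust (it does not use the arrow crate, and the helper name split_for_ipc is invented for this sketch): lengths stay usize internally, and data is split into chunks that fit the receiving implementation's limit (i32 per the spec for maximum compatibility, i64 for C++) before it is sent over IPC/RPC.

    // Hypothetical sketch only; a plain slice stands in for an Arrow array
    // buffer, and `split_for_ipc` is an invented name for the kind of
    // helper being discussed.
    fn split_for_ipc<T>(data: &[T], max_chunk_len: usize) -> Vec<&[T]> {
        assert!(max_chunk_len > 0, "chunk length must be positive");
        // `chunks` yields consecutive slices of at most `max_chunk_len`
        // elements; only the last chunk may be shorter.
        data.chunks(max_chunk_len).collect()
    }

    fn main() {
        // Internally the length is a usize; no artificial cap is imposed.
        let values: Vec<u8> = vec![0u8; 10_000_000];

        // Before passing data to another implementation, cap chunk lengths
        // at what the receiver supports: i32::MAX covers every
        // implementation per the spec; a C++ receiver could accept up to
        // i64::MAX.
        let receiver_limit = i32::MAX as usize;
        let chunks = split_for_ipc(&values, receiver_limit);
        println!("sending {} chunk(s)", chunks.len());
    }

The same shape also allows a much smaller chunk length (e.g. something like the 64K mentioned above) when streaming over Flight/gRPC, analogous to the C++ table-slicing interface linked earlier.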