One question here is: do we want to support datasets with more than 4G entries
on 32-bit systems? If so, how would this even be possible, since you cannot
fit that much data in any addressable memory chunk in Rust?

So I would say: usize is idiomatic and supports large enough datasets on the
system in question. It is 64 bits wide on 64-bit systems and 32 bits wide on
32-bit systems.
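
To make that concrete, here is a minimal sketch. It is not code from the Arrow
crate: chunk_ranges and to_i32_len are hypothetical helper names I made up for
illustration. It shows that usize follows the pointer width of the target, that
converting a usize length to i32 at an interchange boundary can be a checked
operation, and one possible way to split a long array into ranges that a
receiver with a smaller length limit could accept.

```rust
use std::convert::TryFrom;
use std::mem::size_of;

/// Hypothetical helper (an assumption, not an Arrow API): convert a native
/// length to an i32 length, returning None if it does not fit.
fn to_i32_len(len: usize) -> Option<i32> {
    i32::try_from(len).ok()
}

/// Hypothetical helper (an assumption, not an Arrow API): split `len`
/// elements into (offset, length) ranges of at most `max_chunk` elements.
fn chunk_ranges(len: usize, max_chunk: usize) -> Vec<(usize, usize)> {
    assert!(max_chunk > 0, "max_chunk must be non-zero");
    let mut ranges = Vec::new();
    let mut offset = 0;
    while offset < len {
        // The last range may be shorter than max_chunk.
        let n = max_chunk.min(len - offset);
        ranges.push((offset, n));
        offset += n;
    }
    ranges
}

fn main() {
    // usize is pointer-sized: 32 bits on 32-bit targets, 64 on 64-bit ones.
    println!("usize is {} bits wide", size_of::<usize>() * 8);

    // A small length fits in i32; usize::MAX never does.
    println!("{:?} {:?}", to_i32_len(10), to_i32_len(usize::MAX));

    // Ten elements split into ranges of at most four elements each.
    println!("{:?}", chunk_ranges(10, 4));
}
```

In practice the chunk size would be whatever the receiving implementation can
handle, for example i32::MAX for one that only supports i32 lengths.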

On December 7, 2018 4:05:34 PM GMT+01:00, Wes McKinney <wesmck...@gmail.com> 
wrote:
>Thanks for raising the issue, Paddy. In C++/Python/R we often work
>with very large contiguous datasets, so having support for 64-bit
>lengths is important. If supporting this in Rust is not a hardship, I
>think it's a good idea.
>
>For IPC (shared memory) or RPC (Flight / gRPC), in many cases it would
>make sense to break things into smaller chunks. We have an interface
>to slice a table (which may be either contiguous or chunked
>internally) into chunks of a desired size (like 64K or similar)
>
>https://github.com/apache/arrow/blob/master/cpp/src/arrow/table.h#L266
>
>- Wes
>On Thu, Dec 6, 2018 at 8:20 PM paddy horan <paddyho...@hotmail.com>
>wrote:
>>
>> All,
>>
>> As part of the PR for ARROW-3347 there was a discussion regarding the
>> type that should be used for anything that measures the length of an
>> array, i.e. len and capacity.
>>
>> The result of this discussion was that the Rust implementation should
>> switch to using usize as the type for representing len and capacity.
>> This would mean supporting a way to split larger arrays into smaller
>> arrays when passing data from one implementation to another.  The exact
>> size of these smaller arrays would depend on the implementation you are
>> passing data to.  C++ supports array lengths up to the i64 maximum, but
>> **all** implementations support lengths up to the i32 maximum, as
>> specified by the spec.  The full discussion is here:
>> https://github.com/apache/arrow/pull/2858
>>
>> This is not a major change so I’ll push it to 0.13, but I wanted to
>> open up the discussion before making the change, since the previous
>> debate was hidden in a PR.  In particular, Andy and Chao, are you in
>> favor of this change?
>>
>> Paddy
