@Jacques Nadeau <jacq...@dremio.com> would have more background on this. Here's my understanding:

On Thu, Mar 14, 2019 at 12:08 PM Micah Kornfield <emkornfi...@gmail.com> wrote:

> I was working on a proof of concept java implementation for LargeList [1]
> implementation (64-bit array offsets). Our Java implementation doesn't
> appear to support Vectors/Arrays larger than Integer.MAX_VALUE addressable
> space.
>
> It looks like Message.fbs was updated quite a while ago to support 64-bit
> lengths/offsets [2]. I had some questions:
>
> 1. For Java:
> * Is my assessment accurate that it doesn't support 64-bit ranged sizes?
>

Yes.

> * Is there a desire to support the 64 bit sizes? (I didn't come across
> any JIRAs when I did a search)
>

No, afaik.

> * Is there a technical blocker for doing so?
>

- Big change.
- Arrow uses the Netty allocator, which also uses int (32-bit) for capacity:
  https://netty.io/4.0/xref/io/netty/buffer/ByteBufAllocator.html#84

> * Any thoughts on approach for doing such a large change (I'm mostly
> concerned with breaking existing consumers/performance regressions)?
> - Given that the Java code base appears relatively stable, it might be
> that forking and creating a version "2.0" is the best viable option.
>
> 2. For other language implementations, is there support for 64-bit sizes
> or only 32-bit?
>
> Thanks,
> Micah
>
> P.S. It looks like our spec docs are out of date in regards to this issue,
> they still list Int::MAX_VALUE as the largest possible array, it is on my
> plate to update and consolidate them.
>
> [1] https://issues.apache.org/jira/browse/ARROW-4810
> [2]
> https://github.com/apache/arrow/commit/ced9d766d70e84c4d0542c6f5d9bd57faf10781d
>

--
Thanks and regards,
Ravindra.
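
P.S. To make the 32-bit capacity point concrete, here is a rough sketch in plain Java. It does not use Arrow or Netty; the class and method names are hypothetical, made up for illustration. It only shows the general problem: a buffer size computed with 64-bit arithmetic may not fit an API whose capacity parameter is an int, and a plain narrowing cast wraps silently instead of failing.

```java
// Hypothetical illustration only; not Arrow or Netty code.
class OffsetWidthSketch {

    // Bytes needed for valueCount fixed-width values, computed in 64-bit
    // arithmetic. Math.multiplyExact throws ArithmeticException on long overflow.
    static long requiredBytes(long valueCount, int bytesPerValue) {
        return Math.multiplyExact(valueCount, (long) bytesPerValue);
    }

    // True when the size can be handed to an API whose capacity is an int.
    static boolean fitsIntCapacity(long bytes) {
        return bytes >= 0 && bytes <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long small = requiredBytes(1_000_000L, 8);   // 8,000,000 bytes
        long large = requiredBytes(300_000_000L, 8); // 2,400,000,000 bytes

        System.out.println(small + " fits int capacity: " + fitsIntCapacity(small));
        System.out.println(large + " fits int capacity: " + fitsIntCapacity(large));

        // A narrowing cast does not fail; it wraps to a negative number.
        System.out.println("(int) large = " + (int) large);

        // Math.toIntExact fails loudly instead of wrapping.
        try {
            Math.toIntExact(large);
        } catch (ArithmeticException e) {
            System.out.println("toIntExact rejected " + large);
        }
    }
}
```

So 300 million 8-byte values already need more bytes than an int capacity can describe, which is why 64-bit offsets alone would not be enough without changes further down in the allocation path.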