ARROW-4810 [1] and ARROW-750 [2] discuss adding types with 64-bit offsets to Lists, Strings and binary data types.
Philipp started an implementation of the large list type [3], and I hacked together a potentially viable Java implementation [4].

I'd like to kick off the discussion for getting these types voted on. I'm coupling them together because I think there are design considerations for how we evolve Schema.fbs.

There are two proposed options:

1. The current PR proposal, which adds a new type LargeList:

    // List with 64-bit offsets
    table LargeList {}

2. As François suggested, it might be cleaner to parameterize List with offset width. I suppose something like:

    table List {
      // only 32 bit and 64 bit are supported.
      bitWidth: int = 32;
    }

I think Option 2 is cleaner and potentially better long-term, but I think it breaks forward compatibility of the existing Arrow libraries. If we proceed with Option 2, I would advocate making the change to Schema.fbs all at once for all types (assuming we think that 64-bit offsets are desirable for all types), along with future compatibility checks, to avoid multiple releases where future compatibility is broken (by broken I mean the inability to detect that an implementation is receiving data it can't read).

What are people's thoughts on this? Also, are there any other concerns with adding these types?

Thanks,
Micah

[1] https://issues.apache.org/jira/browse/ARROW-4810
[2] https://issues.apache.org/jira/browse/ARROW-750
[3] https://github.com/apache/arrow/pull/3848
[4] https://github.com/apache/arrow/commit/03956cac2202139e43404d7a994508080dc2cdd1
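
P.S. To make the compatibility concern with Option 2 a bit more concrete, here is a rough Python sketch. It is not Arrow code: encode_offsets and decode_offsets are hypothetical helpers I made up for illustration, and I'm assuming little-endian offsets plus a reader that never sees the new bitWidth field and so falls back to 32. Under those assumptions the reader doesn't fail, it just silently produces wrong offsets, which is the "can't detect it" case I mean above.

```python
import struct

# Hypothetical helpers for illustration only; these are not Arrow APIs.
_FORMATS = {32: "<i", 64: "<q"}  # little-endian signed 32-bit / 64-bit


def encode_offsets(offsets, bit_width):
    """Pack list offsets into a bytes buffer using the given offset width."""
    fmt = _FORMATS[bit_width]
    return b"".join(struct.pack(fmt, o) for o in offsets)


def decode_offsets(buf, bit_width=32):
    """Unpack an offsets buffer, rejecting widths the reader does not support."""
    if bit_width not in _FORMATS:
        raise ValueError(f"unsupported offset width: {bit_width}")
    fmt = _FORMATS[bit_width]
    step = bit_width // 8
    return [struct.unpack_from(fmt, buf, i)[0] for i in range(0, len(buf), step)]


# A writer emits a list array with 64-bit offsets.
buf = encode_offsets([0, 3, 7], 64)

# A reader that understands bitWidth decodes it correctly.
correct = decode_offsets(buf, 64)

# A reader that never saw bitWidth assumes the default of 32 and
# reinterprets the same bytes without raising any error.
misread = decode_offsets(buf)
```

Here correct comes back as [0, 3, 7], while misread comes back as [0, 0, 3, 0, 7, 0]. An explicit width check like the one in decode_offsets only helps if the reader actually sees the width, which is why I'd want the compatibility checks to land together with the schema change.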