trxcllnt commented on PR #438: URL: https://github.com/apache/arrow-js/pull/438#issuecomment-4548720106
> I couldn't tell whether you were pointing at a mechanism in the codebase that sidesteps the single-ArrayBuffer limit on child storage
> ...
> The remaining ceiling — JS's per-ArrayBuffer cap of ~2^32 bytes on a single contiguous child buffer — is allocation capacity, not interpretation, and it applies uniformly across List/LargeList/LargeUtf8/LargeBinary. Lifting it would mean a chunked-children redesign (child as Vector rather than single Data<U>), which is a substantial design change. As I previously expressed, I think that's a separate, more ambitious effort and deserves its own issue.

As demonstrated by the example I pasted above, there is no uniform 4GiB cap on ArrayBuffer size in JS. The only limit on List child size is that the value offsets type is Int32, so the child can only contain 2^31 elements. This means a ListVector could represent a column made up entirely of single-element lists, but there could only be 2^31 of them. That ceiling comes from the Int32 offset type, not from any limit on ArrayBuffer allocation size.

> I focused on the wire-format parsing compliance angle, which is what a new commit I pushed addresses

I don't think this commit is right. The reader should be zero-copy, and even if rebasing offsets were the correct thing to do, doing it while reading is not zero-copy.
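
To make the offset-type argument concrete, here is a minimal sketch in plain TypeScript. It does not use the arrow-js API; `INT32_MAX`, `singleElementListOffsets`, and `childByteLength` are hypothetical names introduced only for illustration. It assumes signed 32-bit offsets, so the largest representable offset is 2^31 - 1, and it shows that the byte size of a child addressed that way can be well above 4GiB.

```typescript
// Largest value a signed 32-bit offset can hold (assumption: List offsets are Int32).
const INT32_MAX = 2 ** 31 - 1; // 2147483647

// Hypothetical helper: offsets for a column of `listCount` single-element lists,
// i.e. [0, 1, 2, ..., listCount]. The last offset equals the child length.
function singleElementListOffsets(listCount: number): Int32Array {
  const isWholeNumber = Math.floor(listCount) === listCount;
  if (!isWholeNumber || listCount < 0 || listCount > INT32_MAX) {
    throw new RangeError(`Int32 offsets cannot address ${listCount} child elements`);
  }
  const offsets = new Int32Array(listCount + 1);
  for (let i = 0; i <= listCount; i++) {
    offsets[i] = i;
  }
  return offsets;
}

// Hypothetical helper: bytes needed by a fixed-width child of the given length.
function childByteLength(elementCount: number, bytesPerElement: number): number {
  return elementCount * bytesPerElement;
}

// A Float64 child at the Int32 offset ceiling needs about 16GiB of storage,
// so the element-count ceiling is independent of any 4GiB byte figure.
const float64BytesAtCeiling = childByteLength(INT32_MAX, 8);

// Storing a value past the Int32 range wraps silently instead of throwing,
// which is why the offset type bounds the child length.
const wrapped = new Int32Array([2 ** 31])[0];

console.log(singleElementListOffsets(3), float64BytesAtCeiling, wrapped);
```

The sketch deliberately avoids allocating a large buffer; whether a given engine can actually allocate a child of that size is a separate question from what the offsets can address.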
