trxcllnt commented on PR #438:
URL: https://github.com/apache/arrow-js/pull/438#issuecomment-4548720106

   > I couldn't tell whether you were pointing at a mechanism in the codebase
   > that sidesteps the single-ArrayBuffer limit on child storage
   > ...
   > The remaining ceiling (JS's per-ArrayBuffer cap of ~2^32 bytes on a
   > single contiguous child buffer) is allocation capacity, not interpretation,
   > and it applies uniformly across List/LargeList/LargeUtf8/LargeBinary. Lifting
   > it would mean a chunked-children redesign (child as Vector rather than single
   > Data<U>), which is a substantial design change. As I previously expressed, I
   > think that's a separate, more ambitious effort and deserves its own issue.
   
   As demonstrated by the example I pasted above, there is no uniform 4GiB cap 
on ArrayBuffer size in JS. The only limit on List child size is that the value 
offsets type is Int32, so the child can contain at most 2^31 - 1 elements. This 
means a ListVector could represent a column made entirely of single-element 
lists, but there could be at most 2^31 - 1 of them, because of the Int32 offset 
type and not because of any limit on ArrayBuffer allocation size.
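
   To make the offsets argument concrete, here is a minimal sketch. The helper 
names (`singleElementListOffsets`, `maxChildBytes`) are hypothetical and are not 
part of arrow-js; they only illustrate that Int32 offsets bound the child by 
element count, so the addressable byte size scales with the child's element 
width. Whether a given runtime can actually allocate a buffer that large is a 
separate, engine-dependent question.

```typescript
// Hypothetical helpers for illustration only; not part of apache/arrow-js.

// Largest value a signed 32-bit list offset can hold.
const INT32_MAX = 2 ** 31 - 1;

// Offsets for `n` single-element lists: [0, 1, 2, ..., n].
// List i spans child elements offsets[i] .. offsets[i + 1].
function singleElementListOffsets(n: number): Int32Array {
  if (n > INT32_MAX) {
    throw new RangeError(`offset ${n} does not fit in Int32`);
  }
  const offsets = new Int32Array(n + 1);
  for (let i = 0; i <= n; i++) offsets[i] = i;
  return offsets;
}

// Bytes of child storage addressable through Int32 offsets for a
// fixed-width child type. The bound is in elements, so it grows with width.
function maxChildBytes(bytesPerElement: number): number {
  return INT32_MAX * bytesPerElement;
}
```

   For a Float64 child, `maxChildBytes(8)` is well above 2^32 bytes, which is 
why the offset type, and not a byte-size cap, is the constraint described above.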
   
   > I focused on the wire-format parsing compliance angle, which is what a new
   > commit I pushed addresses
   
   I don't think this commit is right. The reader should be zero-copy, and even 
if rebasing offsets were the correct thing to do, doing it at read time is not 
zero-copy.
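
   A minimal sketch of the distinction, using hypothetical helpers 
(`viewOffsets`, `rebaseOffsets`) that are not the arrow-js reader: a zero-copy 
read wraps the existing bytes in a typed-array view, while rebasing has to write 
adjusted values somewhere, which means either allocating a new buffer or 
mutating the input.

```typescript
// Hypothetical helpers for illustration only; neither is the arrow-js reader.

// Zero-copy: a typed-array view over bytes that already exist in the body.
// `byteOffset` must be a multiple of 4 for an Int32Array view.
function viewOffsets(body: ArrayBuffer, byteOffset: number, length: number): Int32Array {
  return new Int32Array(body, byteOffset, length);
}

// Rebasing: subtract the first offset from every entry so the result starts
// at 0. This version allocates a fresh buffer, so it is a copy by construction.
function rebaseOffsets(offsets: Int32Array): Int32Array {
  const rebased = new Int32Array(offsets.length);
  const base = offsets.length > 0 ? offsets[0] : 0;
  for (let i = 0; i < offsets.length; i++) {
    rebased[i] = offsets[i] - base;
  }
  return rebased;
}
```

   The view shares memory with `body`; the rebased array does not.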

