lz19970205 commented on issue #10776:
URL: https://github.com/apache/arrow/issues/10776#issuecomment-885349007


   > List arrays and string arrays cannot hold more than 2GB of values. This is 
because they are represented as two arrays: a values array and an offsets array.
   > 
   > ```
   >         0  1  2  3  4  5  6  7  8  9  10 11 12 13       
   > Values: s  t  r  i  n  g  1  s  t  r  i  n  g  2
   > Offsets: 0, 7, 14
   > ```
   > 
   > The offsets point to the beginning (and end) of each string. Since the 
offsets array is int32, the maximum offset is 2^31 - 1 (about 2GB), so the 
values array cannot hold more than 2GB of values.
   > 
   > Normally, when this limit is hit, a good workaround is to split your data 
into smaller record batches (you can still represent it as a single table), but 
it will depend on what you are trying to do.
   
   Thanks for the reply!
   So you mean there is a big array in my data?
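
For reference, the layout described in the quote can be sketched in plain Python. This is only an illustration under stated assumptions, not Arrow's implementation: `build_offsets` and `split_by_byte_limit` are hypothetical helper names, and the limit is taken to be the largest int32 value. It shows that the limit applies to the total bytes of all strings in one array, so the data can hit it without any single value being large.

```python
INT32_MAX = 2**31 - 1  # largest offset an int32 offsets array can hold


def build_offsets(strings):
    """Return (values, offsets) in the layout from the quoted diagram.

    Hypothetical helper: values is all strings concatenated as bytes,
    offsets[i]..offsets[i + 1] delimits string i.
    """
    values = b"".join(s.encode("utf-8") for s in strings)
    offsets = [0]
    for s in strings:
        offsets.append(offsets[-1] + len(s.encode("utf-8")))
    return values, offsets


def split_by_byte_limit(strings, limit=INT32_MAX):
    """Group strings into batches whose total byte size stays within limit.

    Hypothetical helper mirroring the "smaller record batches" workaround.
    """
    batches, current, size = [], [], 0
    for s in strings:
        n = len(s.encode("utf-8"))
        if n > limit:
            raise ValueError("a single value exceeds the limit")
        if current and size + n > limit:
            # Adding this string would overflow the batch, so start a new one.
            batches.append(current)
            current, size = [], 0
        current.append(s)
        size += n
    if current:
        batches.append(current)
    return batches


values, offsets = build_offsets(["string1", "string2"])
print(offsets)  # [0, 7, 14]

# A tiny limit stands in for 2GB so the split is visible.
batches = split_by_byte_limit(["string1", "string2", "string3"], limit=14)
print(batches)  # [['string1', 'string2'], ['string3']]
```

Each resulting batch would have its own values and offsets arrays, which is why splitting keeps every offset within the int32 range.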


