WillAyd commented on issue #41224: URL: https://github.com/apache/arrow/issues/41224#issuecomment-2059002060
> Wouldn't this depend on the behavior and read-batch-size? The cases can separate to issues below:

These are great callouts. I see in the implementation that there is a `ByteArrayChunkedRecordReader` and a `ByteArrayDictionaryRecordReader`, and I assume this would only affect the former.

> Besides, should user use smaller batch size rather than read a whole table. I think this interface change would be harder than expected...

I am (possibly mistakenly) under the impression that the metadata for each column in a parquet file is stored per column chunk. But I do have a lot to learn here; I will keep this in mind as I look at the code more closely. Thanks for the notes!