westonpace commented on pull request #10485: URL: https://github.com/apache/arrow/pull/10485#issuecomment-857950376
For context, I want to work on parallelizing the streaming CSV reader. I'd like to investigate smaller block sizes for the earlier stages, since those stages perform effectively random access, and I was curious whether keeping the data in roughly L2-cache-sized chunks would work. I didn't know if setting the block size that small would hurt read speeds, or if buffering the reads would compensate. I'm also just working through each stage (read, parse, decode) to get a good grasp on the performance of each one.
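To illustrate the buffering question: the concern is that small block sizes mean many small `read()` calls, each paying syscall overhead, while a buffered reader amortizes that cost by reading ahead in larger chunks. A minimal stdlib-only sketch of the two strategies (the 256 KiB "L2-sized" block and the 1 MiB buffer size are illustrative assumptions, not numbers from this PR):

```python
import io
import os
import tempfile

def read_in_blocks(f, block_size):
    """Consume a file-like object in fixed-size blocks, returning total bytes."""
    total = 0
    while True:
        chunk = f.read(block_size)
        if not chunk:
            break
        total += len(chunk)
    return total

# Throwaway CSV-like file to read against.
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(b"a,b,c\n" * 200_000)
    path = tmp.name

BLOCK = 256 * 1024  # assumed "L2-sized" block; real L2 sizes vary by CPU

# Unbuffered: every read_in_blocks() iteration is a direct OS read of BLOCK bytes.
with open(path, "rb", buffering=0) as f:
    raw_total = read_in_blocks(f, BLOCK)

# Buffered: io.BufferedReader issues larger reads underneath and serves the
# small BLOCK-sized requests from its buffer.
with io.BufferedReader(io.FileIO(path, "rb"), buffer_size=1 << 20) as f:
    buf_total = read_in_blocks(f, BLOCK)

os.remove(path)
print(raw_total, buf_total)
```

Timing the two loops (e.g. with `time.perf_counter()`) on a real dataset would answer whether buffering matters at the block sizes being considered; both strategies read the same bytes, only the syscall pattern differs.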