[GitHub] [arrow] cpcloud commented on pull request #13442: ARROW-9612: [C++] increase default block_size from 1MB to 16MB

GitBox Thu, 30 Jun 2022 07:51:53 -0700


cpcloud commented on PR #13442:
URL: https://github.com/apache/arrow/pull/13442#issuecomment-1171320351


   > Hmm, sorry, perhaps I'm being dense but I don't understand how your 
question relates to what I said?
   
   No, I'm probably not being clear.
   
   You said
   
   > IIRC we don't have any real-world JSON parsing benchmarks. There are some 
C++ micro-benchmarks, but I don't think they would help estimate scaling 
characteristics on large data files.
   
   If we don't have any real-world benchmarks that can help estimate scaling 
characteristics on large files, then we should at least be able to tell whether 
the 16MB block size affects the micro-benchmarks.
   
   In the case where a file is larger than 1MB, it's required to set the block 
size anyway, so I'm not following what the counter argument to this PR is 
regarding the performance of large files. You get what you get, because there's 
no other option if you want something that works at all.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

