westonpace commented on pull request #9656:
URL: https://github.com/apache/arrow/pull/9656#issuecomment-812077030


   Also, thanks for doing all this.  It's nice to see some improvement at least 
in some cases.  Gives some good validation we aren't solving these tricky local 
cases for no reason.
   
   I also wouldn't worry too much about the low-file-count cases.  They could, in 
theory, improve with better intra-file parallelism, but we aren't taking a very 
long time here in the first case, and I feel that intra-file parallelism will 
always be less efficient than inter-file parallelism because it is breaking up 
processing of the same data across multiple threads.  That is something that can 
be overcome when the processing is expensive but maybe not so easily overcome 
here.  That theory isn't finalized though and I may be completely wrong.
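
   To make the distinction concrete, here is a rough sketch of the two 
strategies.  It is not Arrow code: `process_chunk`, `inter_file` and 
`intra_file` are hypothetical names, the "files" are plain in-memory lists, and 
the summing is a stand-in for real per-file work.  It only shows where the extra 
split and recombine step appears in the intra-file case.

```python
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk):
    # Stand-in for real work (decoding, filtering, ...); hypothetical.
    return sum(chunk)


def inter_file(files, workers=4):
    # Inter-file parallelism: one task per file, each thread owns a whole file.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_chunk, files))


def intra_file(data, workers=4):
    # Intra-file parallelism: one file is split across threads, so there is
    # an extra split step up front and a recombine step at the end.
    step = max(1, -(-len(data) // workers))  # ceiling division
    chunks = [data[i:i + step] for i in range(0, len(data), step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(process_chunk, chunks))


# Eight small in-memory "files" standing in for a dataset.
files = [list(range(1000)) for _ in range(8)]
per_file_totals = inter_file(files)
single_file_total = intra_file(files[0])
```

   With many files the first form keeps every thread busy without any 
coordination inside a file; with one or two files only the second form can use 
more threads, at the cost of the split and merge.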


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

