westonpace commented on pull request #9656: URL: https://github.com/apache/arrow/pull/9656#issuecomment-812081517
Re: the regression with ARROW-7001. I'm pretty sure I know what is going on here: the reads are being serialized, so the files aren't processed in parallel (which explains why the regression gets worse the more files you have). I suppose I will need to fix this in ARROW-7001, unless we want 7001 and 11772 to be a package deal (and presumably we've broken Parquet as well).
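To illustrate why serialized reads would get worse as the file count grows, here is a rough sketch in plain Python. It is not Arrow code: `read_file`, `read_serial`, and `read_parallel` are hypothetical names, and the sleep is an assumed stand-in for per-file I/O latency. The serial version's elapsed time grows roughly linearly with the number of files, while the threaded version overlaps the waits.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def read_file(index, latency=0.05):
    """Hypothetical stand-in for reading one file; sleeps to mimic I/O latency."""
    time.sleep(latency)
    return index


def read_serial(n_files):
    """Read files one after another; total time is about n_files * latency."""
    start = time.perf_counter()
    results = [read_file(i) for i in range(n_files)]
    return results, time.perf_counter() - start


def read_parallel(n_files, max_workers=8):
    """Read files on a thread pool so the per-file waits overlap."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results match the serial version.
        results = list(pool.map(read_file, range(n_files)))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    for n in (2, 8):
        _, serial_time = read_serial(n)
        _, parallel_time = read_parallel(n)
        print(f"{n} files: serial {serial_time:.2f}s, parallel {parallel_time:.2f}s")
```

With more files, the gap between the two timings widens, which would match a regression that scales with file count if the reads are in fact being serialized.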