westonpace commented on issue #33759: URL: https://github.com/apache/arrow/issues/33759#issuecomment-1399046967
Hmm, backpressure should be applied then. Once you call `to_batches` it should start reading in the background. Eventually, at a certain point, it should stop reading because too much data has accumulated; this is normally around a few GB.

You mention there are 13k fragments. Just to confirm, is this 13k files? How large is each file, and how many row groups are in each file?
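For reference, here is a minimal sketch of the streaming pattern I'm describing (the dataset path and format are just placeholders, not taken from your setup):

```python
import pyarrow.dataset as ds

# Placeholder path/format for illustration only.
dataset = ds.dataset("path/to/13k-files", format="parquet")

total_rows = 0
# to_batches() returns an iterator; scanning runs in background threads and
# should pause (backpressure) once a few GB of decoded batches are queued up
# faster than the consumer drains them.
for batch in dataset.to_batches():
    total_rows += batch.num_rows

print(total_rows)
```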