tustvold commented on issue #7983: URL: https://github.com/apache/arrow-rs/issues/7983#issuecomment-3112166311
Broadly speaking, I agree with this. In fact, my original proposal was for such a reader (https://github.com/apache/arrow-rs/issues/1605). However, the realities of the current code and various aspects of the Parquet format meant we ended up with the current situation as a pragmatic hack. I'd love to see something better in this space. That being said, there are various aspects of the Parquet format that make this rather difficult, and there's a fair amount of subtlety in how things like RecordReader work, since they rely on being able to see whether there may be more pages. Fortunately, we have relatively good test coverage of these quirks, so provided any rework is able to reuse those tests, we should avoid regressions.