I have picked an approach that reduces the number of GET calls for small files from three (two for the footer plus one for the actual data read) to one. For small files, we can buffer the whole file instead of issuing separate calls.
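To make the GET arithmetic concrete, here is a minimal sketch in Python (chosen only for brevity; it is not the code in the PR). `CountingStore`, `read_range`, and `read_footer` are hypothetical names, and the in-memory store only counts range reads. The sketch assumes the file size is already known (for example from table metadata) and relies on the Parquet layout in which a file ends with a 4-byte little-endian footer length followed by the 4-byte magic `PAR1`.

```python
import struct

MAGIC = b"PAR1"
TAIL_LEN = 8  # 4-byte little-endian footer length + 4-byte magic


class CountingStore:
    """Hypothetical in-memory stand-in for an object store; counts range reads."""

    def __init__(self, blob: bytes):
        self.blob = blob
        self.gets = 0

    def size(self) -> int:
        # Assumption: the size is known up front, so it costs no GET here.
        return len(self.blob)

    def read_range(self, start: int, length: int) -> bytes:
        self.gets += 1  # each range read models one GET call
        return self.blob[start:start + length]


def read_footer(store: CountingStore, hint: int = TAIL_LEN) -> bytes:
    """Return the raw footer bytes, fetching `hint` trailing bytes up front."""
    size = store.size()
    if size < TAIL_LEN:
        raise ValueError("file too small to be a Parquet file")
    n = min(max(hint, TAIL_LEN), size)
    tail = store.read_range(size - n, n)
    if tail[-4:] != MAGIC:
        raise ValueError("missing Parquet magic bytes")
    (footer_len,) = struct.unpack("<I", tail[-8:-4])
    if footer_len + TAIL_LEN <= n:
        # The footer is already inside the buffered tail: no second GET.
        return tail[n - TAIL_LEN - footer_len:n - TAIL_LEN]
    # The hint was too small: fall back to a second GET for the footer.
    return store.read_range(size - TAIL_LEN - footer_len, footer_len)
```

With the default `hint` of 8 bytes, the footer costs two range reads. Passing `hint=store.size()` for a small file fetches everything in one read, so the footer and the data come from the same buffer.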
High-level implementation: https://github.com/apache/iceberg/pull/16729

Similar changes in parquet-mr (like those in arrow-rs) could result in a much cleaner approach here: given a hint, the reader would fetch that many bytes from the end of the file instead of only the last 8 bytes, so an additional GET call to fetch the footer may not be needed. For the same reason, I have started a discussion to figure out the best approach: https://lists.apache.org/thread/yb8nom3w2zplb703m0p052kcc1wwotrr

I would appreciate your input and feedback there.

Thanks
--
Lakhyani Varun
Indian Institute of Technology Roorkee
