a-agmon commented on PR #588: URL: https://github.com/apache/iceberg-rust/pull/588#issuecomment-2323298188
> Aah I see! That makes complete sense. So, what you are saying is, we do the best we can to filter via metadata within iceberg, but whatever operations we can't handle in iceberg will get applied at row-level anyway once the data is passed back to DataFusion? If so, then I withdraw my objection as this seems totally sensible 👍

Precisely! DataFusion is pretty fast at scanning and processing Parquet files and record batches, but I think the major performance boost Iceberg can bring comes from using its metadata to filter out and prune data files. So we make a best effort to prune using the predicate and let DF handle the rest (some of my tests on huge tables show that this gives a great performance boost).
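To illustrate the split described above, here is a minimal, self-contained sketch. It is a toy model, not the iceberg-rust or DataFusion API: `DataFile`, `Filter`, `may_match_file`, `matches_row`, and `scan` are all hypothetical names, and per-file min/max values stand in for the metadata Iceberg keeps about data files. The assumption is that pruning only ever drops files that cannot contain a match, so re-applying the full filter per row keeps the result exact.

```rust
// Hypothetical, simplified model of "prune with metadata, filter rows afterwards".
// None of these types come from iceberg-rust or DataFusion.

/// A data file with min/max statistics for a single integer column.
#[derive(Debug, Clone)]
pub struct DataFile {
    pub path: &'static str,
    pub min: i64,
    pub max: i64,
    pub rows: Vec<i64>,
}

/// A filter over that single integer column.
#[derive(Debug, Clone)]
pub enum Filter {
    GreaterThan(i64),
    LessThan(i64),
    /// Stands in for any expression the metadata layer cannot evaluate.
    IsEven,
    And(Box<Filter>, Box<Filter>),
}

impl Filter {
    /// Exact row-level evaluation (the query engine's role in this sketch).
    pub fn matches_row(&self, value: i64) -> bool {
        match self {
            Filter::GreaterThan(x) => value > *x,
            Filter::LessThan(x) => value < *x,
            Filter::IsEven => value % 2 == 0,
            Filter::And(a, b) => a.matches_row(value) && b.matches_row(value),
        }
    }

    /// Conservative file-level evaluation: returns `false` only when the
    /// min/max statistics prove that no row in the file can match.
    pub fn may_match_file(&self, file: &DataFile) -> bool {
        match self {
            Filter::GreaterThan(x) => file.max > *x,
            Filter::LessThan(x) => file.min < *x,
            // Not expressible with min/max stats, so the file is kept.
            Filter::IsEven => true,
            Filter::And(a, b) => a.may_match_file(file) && b.may_match_file(file),
        }
    }
}

/// Prunes files using statistics, then applies the full filter to every row
/// of the surviving files. Returns (number of files read, matching rows).
pub fn scan(files: &[DataFile], filter: &Filter) -> (usize, Vec<i64>) {
    let kept: Vec<&DataFile> = files
        .iter()
        .filter(|f| filter.may_match_file(f))
        .collect();
    let rows: Vec<i64> = kept
        .iter()
        .flat_map(|f| f.rows.iter().copied())
        .filter(|v| filter.matches_row(*v))
        .collect();
    (kept.len(), rows)
}
```

In this sketch, a filter such as `And(GreaterThan(15), IsEven)` lets the statistics skip any file whose `max` is at most 15, while the `IsEven` part is only enforced during the row-level pass.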