a-agmon commented on PR #588:
URL: https://github.com/apache/iceberg-rust/pull/588#issuecomment-2323298188

   > Aah I see! That makes complete sense. So, what you are saying is, we do 
the best we can to filter via metadata within iceberg, but whatever operations 
we can't handle in iceberg will get applied at row-level anyway once the data 
is passed back to DataFusion? If so, then I withdraw my objection as this seems 
totally sensible 👍
   
   Precisely!
   DataFusion is pretty fast at scanning and processing Parquet files and 
record batches, but I think the major performance boost that Iceberg can 
bring comes from using its metadata to filter out and prune data files. So we 
make a best effort to prune using the predicate and let DF handle the rest 
(some of my tests on huge tables show a great performance boost from this).
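
   To make the idea concrete, here is a rough, self-contained sketch of 
best-effort pruning. It is not the code in this PR and does not use the 
iceberg-rust or DataFusion APIs: `Predicate`, `DataFile`, `may_contain_matches`, 
`row_matches`, `scan`, and `sample_files` are all hypothetical names made up for 
illustration, and the file names and values are invented. Predicates that can 
be checked against per-file min/max statistics are used to skip whole files; a 
predicate with no metadata support simply keeps the file, and every predicate 
is re-applied per row afterwards (the role DataFusion would play). As far as I 
understand, this resembles what DataFusion calls `Inexact` filter pushdown.

```rust
// Hypothetical types for illustration only; not the iceberg-rust or DataFusion API.

#[derive(Debug, Clone)]
enum Predicate {
    GreaterThan(i64), // value > n: can be checked against file max
    LessThan(i64),    // value < n: can be checked against file min
    IsEven,           // no metadata support: must be checked row by row
}

struct DataFile {
    name: &'static str,
    min: i64,       // column lower bound, as file metadata would record it
    max: i64,       // column upper bound
    rows: Vec<i64>, // stand-in for the Parquet contents
}

/// Metadata-level check. Returns false only when the statistics prove that
/// no row in the file can match; anything uncertain returns true.
fn may_contain_matches(file: &DataFile, pred: &Predicate) -> bool {
    match pred {
        Predicate::GreaterThan(n) => file.max > *n,
        Predicate::LessThan(n) => file.min < *n,
        Predicate::IsEven => true, // cannot prune, so keep the file
    }
}

/// Row-level check, standing in for the filtering the query engine does.
fn row_matches(value: i64, pred: &Predicate) -> bool {
    match pred {
        Predicate::GreaterThan(n) => value > *n,
        Predicate::LessThan(n) => value < *n,
        Predicate::IsEven => value % 2 == 0,
    }
}

/// Prunes files using metadata, then applies all predicates to the rows of
/// the files that survived. Returns the scanned file names and matching rows.
fn scan(files: &[DataFile], preds: &[Predicate]) -> (Vec<&'static str>, Vec<i64>) {
    let mut scanned = Vec::new();
    let mut matched = Vec::new();
    for file in files {
        if !preds.iter().all(|p| may_contain_matches(file, p)) {
            continue; // skipped without reading any rows
        }
        scanned.push(file.name);
        matched.extend(
            file.rows
                .iter()
                .copied()
                .filter(|v| preds.iter().all(|p| row_matches(*v, p))),
        );
    }
    (scanned, matched)
}

fn sample_files() -> Vec<DataFile> {
    vec![
        DataFile { name: "a.parquet", min: 1, max: 50, rows: vec![1, 20, 50] },
        DataFile { name: "b.parquet", min: 90, max: 150, rows: vec![90, 101, 150] },
        DataFile { name: "c.parquet", min: 200, max: 300, rows: vec![202, 255, 300] },
    ]
}

fn main() {
    let files = sample_files();
    let preds = [Predicate::GreaterThan(100), Predicate::IsEven];
    let (scanned, rows) = scan(&files, &preds);
    println!("scanned files: {:?}", scanned);
    println!("matching rows: {:?}", rows);
}
```

   In this toy run `a.parquet` is never read because its max is below the 
threshold, while `IsEven` prunes nothing and is enforced only at row level. The 
result stays correct either way; pruning only reduces how much data is read.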


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

