yordan-pavlov commented on a change in pull request #8917:
URL: https://github.com/apache/arrow/pull/8917#discussion_r543655758
##########
File path: rust/datafusion/src/datasource/parquet.rs
##########
@@ -65,6 +66,7 @@ impl TableProvider for ParquetTable {
&self,
projection: &Option<Vec<usize>>,
batch_size: usize,
+ _filters: &[Expr],
Review comment:
@returnString it's great to see someone else working on predicate
push-down as well.
I have been working on this for a couple of weeks, targeting an end-to-end
implementation for parquet, and have made similar changes to the filter
push-down optimizer. Your implementation is better because of the idea of
full vs partial filter push-down. In my version I have `predicate:
&Option<Expr>`, but `filters: &[Expr]` should work as well.
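
To illustrate why the two signatures can carry the same information, here is a minimal sketch. It assumes the filters are meant to be AND-ed together, and it uses a toy `Expr` enum rather than DataFusion's real one. The names `combine_filters`, `split_conjunction`, `PushDown` and `residual_filters` are hypothetical and not taken from either implementation. The last function shows one possible reading of full vs partial push-down: only filters the source handles exactly can be dropped from the plan.

```rust
// Toy stand-in for DataFusion's `Expr` (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Gt(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Fold a slice of filters into a single optional conjunction,
/// i.e. go from `filters: &[Expr]` to `predicate: Option<Expr>`.
fn combine_filters(filters: &[Expr]) -> Option<Expr> {
    filters
        .iter()
        .cloned()
        .reduce(|acc, f| Expr::And(Box::new(acc), Box::new(f)))
}

/// Split an optional conjunction back into its individual filters.
fn split_conjunction(predicate: &Option<Expr>) -> Vec<Expr> {
    fn walk(expr: &Expr, out: &mut Vec<Expr>) {
        match expr {
            Expr::And(left, right) => {
                walk(left, out);
                walk(right, out);
            }
            other => out.push(other.clone()),
        }
    }
    let mut out = Vec::new();
    if let Some(p) = predicate {
        walk(p, &mut out);
    }
    out
}

/// Hypothetical answer a data source could give for each filter.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PushDown {
    Unsupported, // the source ignores the filter
    Inexact,     // the source may still return non-matching rows (partial)
    Exact,       // the source returns only matching rows (full)
}

/// Filters the plan must still apply after the scan: everything that the
/// source does not handle exactly.
fn residual_filters(filters: &[Expr], support: impl Fn(&Expr) -> PushDown) -> Vec<Expr> {
    filters
        .iter()
        .filter(|f| support(*f) != PushDown::Exact)
        .cloned()
        .collect()
}
```

With a slice, the data source can answer per filter, which is harder to express when everything is already folded into one predicate.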
I think it makes sense to separate the generic support for pushing
predicates down to the data source from the implementations for specific data
sources such as parquet: each change will be fairly big, so it is better to
split the work into smaller changes.
Regarding a parquet implementation of predicate push-down, I have been
working on the idea of building arrays from the min / max statistics in row
groups and then reusing the existing physical expressions already implemented
in datafusion. I already have the code that builds the statistics arrays and
am now working on the expression evaluation. Hopefully I will have enough to
start a PR in the next couple of weeks.
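
A rough sketch of the min / max idea, under simplifying assumptions: a single `i64` column, plain `Vec<i64>` in place of Arrow arrays, and a hand-written `match` in place of datafusion's physical expressions. `ColumnStats`, `Predicate` and `row_groups_to_read` are hypothetical names for illustration, not the code mentioned above.

```rust
/// Min / max statistics for one column, with one entry per row group
/// (toy stand-in for the statistics arrays; hypothetical type).
struct ColumnStats {
    min: Vec<i64>,
    max: Vec<i64>,
}

/// Simple comparisons of the column against a literal (hypothetical type).
enum Predicate {
    Gt(i64),
    Lt(i64),
    Eq(i64),
}

/// Returns one flag per row group: `true` means the row group may contain
/// matching rows and has to be read, `false` means it can be skipped.
fn row_groups_to_read(stats: &ColumnStats, predicate: &Predicate) -> Vec<bool> {
    stats
        .min
        .iter()
        .zip(stats.max.iter())
        .map(|(&min, &max)| match predicate {
            // `col > v` can only match if the largest value exceeds v.
            Predicate::Gt(v) => max > *v,
            // `col < v` can only match if the smallest value is below v.
            Predicate::Lt(v) => min < *v,
            // `col = v` can only match if v lies within [min, max].
            Predicate::Eq(v) => min <= *v && *v <= max,
        })
        .collect()
}
```

The result is conservative: a `true` flag does not guarantee a match, so the original filter still has to run on the rows that are read.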