thinkharderdev commented on code in PR #3822: URL: https://github.com/apache/arrow-datafusion/pull/3822#discussion_r996289941
##########
datafusion/core/src/physical_plan/file_format/parquet.rs:
##########
@@ -191,15 +154,71 @@ impl ParquetExec {
         self
     }

-    /// Configure `ParquetScanOptions`
-    pub fn with_scan_options(mut self, scan_options: ParquetScanOptions) -> Self {
-        self.scan_options = scan_options;
+    /// If true, any filter [`Expr`]s on the scan will converted to a
+    /// [`RowFilter`](parquet::arrow::arrow_reader::RowFilter) in the
+    /// `ParquetRecordBatchStream`. These filters are applied by the
+    /// parquet decoder to skip unecessairly decoding other columns
+    /// which would not pass the predicate. Defaults to false
+    pub fn with_pushdown_filters(self, pushdown_filters: bool) -> Self {
+        self.base_config
+            .config_options
+            .write()
+            .set_bool(OPT_PARQUET_PUSHDOWN_FILTERS, pushdown_filters);
+        self
+    }
+
+    /// Return the value described in [`Self::with_pushdown_filters`]
+    pub fn pushdown_filters(&self) -> bool {
+        self.base_config
+            .config_options
+            .read()
+            .get_bool(OPT_PARQUET_PUSHDOWN_FILTERS)
+            // default to false
+            .unwrap_or_default()
+    }
+
+    /// If true, the `RowFilter` made by `pushdown_filters` may try to
+    /// minimize the cost of filter evaluation by reordering the
+    /// predicate [`Expr`]s. If false, the predicates are applied in
+    /// the same order as specified in the query. Defaults to false.
+    pub fn with_reorder_filters(self, reorder_filters: bool) -> Self {

Review Comment:
See above: wrapping the options in an `Arc<RwLock<_>>` seems strange, since this is already essentially an owned value.

##########
datafusion/core/src/physical_plan/file_format/mod.rs:
##########
@@ -698,6 +699,7 @@ mod tests {
             projection,
             statistics,
             table_partition_cols,
+            config_options: ConfigOptions::new().into_shareable(),

Review Comment:
Not sure I understand why we need the `into_shareable` here. Seems like this should just be an owned `ConfigOptions`.
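To make the point in both comments concrete, here is a minimal sketch that does not use any DataFusion types. `Options`, `OwnedScan`, `SharedScan`, and the `"pushdown_filters"` key are hypothetical stand-ins invented for illustration, and the standard library `RwLock` is assumed (hence the `unwrap()` calls, which the diff above does not have). It contrasts a by-value builder that owns its options with one that keeps them behind an `Arc<RwLock<_>>`:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical bag of boolean options (a stand-in, not DataFusion's `ConfigOptions`).
#[derive(Debug, Default, Clone)]
struct Options {
    bools: HashMap<String, bool>,
}

impl Options {
    fn set_bool(&mut self, key: &str, value: bool) {
        self.bools.insert(key.to_string(), value);
    }

    fn get_bool(&self, key: &str) -> Option<bool> {
        self.bools.get(key).copied()
    }
}

/// Variant A: the scan owns its options, so a by-value builder
/// can mutate them directly without any locking.
#[derive(Debug, Default)]
struct OwnedScan {
    options: Options,
}

impl OwnedScan {
    fn with_pushdown_filters(mut self, enabled: bool) -> Self {
        self.options.set_bool("pushdown_filters", enabled);
        self
    }

    fn pushdown_filters(&self) -> bool {
        // A missing key falls back to `false`.
        self.options.get_bool("pushdown_filters").unwrap_or_default()
    }
}

/// Variant B: the options sit behind a shared lock, so every clone
/// of the scan reads and writes the same underlying `Options`.
#[derive(Debug, Default, Clone)]
struct SharedScan {
    options: Arc<RwLock<Options>>,
}

impl SharedScan {
    fn with_pushdown_filters(self, enabled: bool) -> Self {
        // The write guard is a temporary and is released at the end of this statement.
        self.options
            .write()
            .unwrap()
            .set_bool("pushdown_filters", enabled);
        self
    }

    fn pushdown_filters(&self) -> bool {
        self.options
            .read()
            .unwrap()
            .get_bool("pushdown_filters")
            .unwrap_or_default()
    }
}
```

In this sketch, a call to `SharedScan::with_pushdown_filters` is also visible through any earlier clone of the same value, because the clones share one `Options`. The owned variant keeps the change local to the value returned by the builder. Whether that sharing is intended in the PR is a question for the author; the sketch only shows what the two shapes imply.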