lvheyang commented on a change in pull request #749:
URL: https://github.com/apache/arrow-datafusion/pull/749#discussion_r671872318



##########
File path: datafusion/src/datasource/parquet.rs
##########
@@ -38,11 +38,22 @@ pub struct ParquetTable {
     schema: SchemaRef,
     statistics: Statistics,
     max_concurrency: usize,
+    enable_pruning: bool,
 }
 
 impl ParquetTable {
     /// Attempt to initialize a new `ParquetTable` from a file path.
     pub fn try_new(path: impl Into<String>, max_concurrency: usize) -> Result<Self> {
+        ParquetTable::try_new_with_pruning_config(path, max_concurrency, true)
+    }
+
+    /// Attempt to initialize a new `ParquetTable` from a file path. And enable or
+    /// disable the parquet pruning features.
+    pub fn try_new_with_pruning_config(

Review comment:
       I'm not sure adding this function is a good choice.
   
   My concern is that it is a public function, so many users may come to rely
on it. But the `enable_pruning` parameter in the signature is somewhat
temporary; we don't want it to stick around for long.
   
   So I have another thought: replace this function with 
`try_new_with_config(path: impl Into<String>, execution_config: 
ExecutionConfig)`. I think that is the better option, but it would introduce a 
dependency on the `execution::context` module, which I think is the top-level 
module. That seems a little odd.
   
   Is the second approach acceptable?
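   
   To make the trade-off concrete, here is a rough sketch of the two
constructor shapes. Everything in it is a simplified stand-in rather than the
real DataFusion code: the `ExecutionConfig` struct and its `concurrency` and
`parquet_pruning` fields, the `String` error type, and the choice to read the
concurrency from the config are all assumptions made for illustration.

```rust
// Stand-in types for illustration only; NOT the real DataFusion definitions.

/// Hypothetical stand-in for `ExecutionConfig` in `execution::context`.
/// The field names here are assumptions.
#[derive(Debug, Clone)]
pub struct ExecutionConfig {
    pub concurrency: usize,
    pub parquet_pruning: bool,
}

/// Simplified stand-in for `ParquetTable` (schema and statistics omitted).
#[derive(Debug)]
pub struct ParquetTable {
    path: String,
    max_concurrency: usize,
    enable_pruning: bool,
}

impl ParquetTable {
    /// Existing constructor: signature unchanged, pruning enabled by default.
    pub fn try_new(
        path: impl Into<String>,
        max_concurrency: usize,
    ) -> Result<Self, String> {
        Self::try_new_with_pruning_config(path, max_concurrency, true)
    }

    /// Option 1 (as in the PR): the flag is part of the public signature,
    /// so removing it later is a breaking change for callers.
    pub fn try_new_with_pruning_config(
        path: impl Into<String>,
        max_concurrency: usize,
        enable_pruning: bool,
    ) -> Result<Self, String> {
        Ok(Self {
            path: path.into(),
            max_concurrency,
            enable_pruning,
        })
    }

    /// Option 2: accept a config value. Settings can be added or retired
    /// without changing this signature, at the cost of the datasource
    /// module depending on the config type.
    pub fn try_new_with_config(
        path: impl Into<String>,
        execution_config: ExecutionConfig,
    ) -> Result<Self, String> {
        Ok(Self {
            path: path.into(),
            // Assumption: concurrency is taken from the config as well.
            max_concurrency: execution_config.concurrency,
            enable_pruning: execution_config.parquet_pruning,
        })
    }
}
```

   With option 2, a caller that wants pruning off would build the config
first and pass it in, so dropping the flag later would only touch the config
type and not the constructor signature.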



