[jira] [Updated] (ARROW-11059) [Rust] [DataFusion] Implement extensible configuration mechanism

2021-04-11 Thread Andy Grove (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Grove updated ARROW-11059:
---
Fix Version/s: (was: 4.0.0)

> [Rust] [DataFusion] Implement extensible configuration mechanism
> 
>
> Key: ARROW-11059
> URL: https://issues.apache.org/jira/browse/ARROW-11059
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Rust - DataFusion
> Reporter: Andy Grove
> Assignee: Andy Grove
> Priority: Major
>
> We are getting to the point where there are multiple settings we could add to 
> operators to fine-tune performance. Custom operators provided by crates that 
> extend DataFusion may also need this capability.
> I propose that we add support for key-value configuration options so that we 
> don't need to plumb through each new configuration setting that we add.
> For example, I am about to start on a "coalesce batches" operator and I would 
> like a setting such as "coalesce.batch.size".
> For built-in settings like this we can attach metadata such as a description 
> and a default value, and generate the user documentation from that metadata.
> For example, here is how Spark defines configs:
> {code:java}
> val PARQUET_VECTORIZED_READER_ENABLED =
>   buildConf("spark.sql.parquet.enableVectorizedReader")
>     .doc("Enables vectorized parquet decoding.")
>     .version("2.0.0")
>     .booleanConf
>     .createWithDefault(true)
> {code}
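
To make the proposal above concrete, here is a minimal Rust sketch of a key-value configuration store with a registry of documented built-in settings. It is only an illustration under stated assumptions: `ConfigEntry`, `ConfigOptions`, and all of their methods are hypothetical names and not an existing DataFusion API, and values are kept as plain strings for simplicity.

```rust
use std::collections::HashMap;

/// Metadata for one built-in setting: key, description, and default value.
/// Hypothetical type for illustration; not part of DataFusion.
#[derive(Debug, Clone)]
pub struct ConfigEntry {
    pub key: &'static str,
    pub doc: &'static str,
    pub default: &'static str,
}

/// String key-value options plus a registry of known built-in settings.
#[derive(Debug, Default)]
pub struct ConfigOptions {
    registry: Vec<ConfigEntry>,
    values: HashMap<String, String>,
}

impl ConfigOptions {
    pub fn new() -> Self {
        Self::default()
    }

    /// Register a built-in setting so it has a default and appears in the docs.
    pub fn register(&mut self, entry: ConfigEntry) {
        self.registry.push(entry);
    }

    /// Set any key, registered or not, so extension crates can use their own keys.
    pub fn set(&mut self, key: &str, value: &str) {
        self.values.insert(key.to_string(), value.to_string());
    }

    /// Return the explicitly set value, else the registered default, else None.
    pub fn get(&self, key: &str) -> Option<&str> {
        if let Some(v) = self.values.get(key) {
            return Some(v.as_str());
        }
        self.registry.iter().find(|e| e.key == key).map(|e| e.default)
    }

    /// Read a setting as usize; None if it is missing or does not parse.
    pub fn get_usize(&self, key: &str) -> Option<usize> {
        self.get(key)?.parse().ok()
    }

    /// Render one documentation line per registered setting.
    pub fn generate_docs(&self) -> String {
        self.registry
            .iter()
            .map(|e| format!("{} (default: {}): {}\n", e.key, e.default, e.doc))
            .collect()
    }
}
```

With something like this, an operator could call `opts.get_usize("coalesce.batch.size")` instead of having a dedicated field plumbed through, and a crate extending DataFusion could call `opts.set(...)` with its own keys without registering them. Any default used for "coalesce.batch.size" would be a placeholder; the issue does not specify one.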



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (ARROW-11059) [Rust] [DataFusion] Implement extensible configuration mechanism

2020-12-31 Thread Andy Grove (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Grove updated ARROW-11059:
---
Fix Version/s: (was: 3.0.0)
   4.0.0

> [Rust] [DataFusion] Implement extensible configuration mechanism
> 
>
> Key: ARROW-11059
> URL: https://issues.apache.org/jira/browse/ARROW-11059
> Project: Apache Arrow
> Issue Type: New Feature
> Components: Rust - DataFusion
> Reporter: Andy Grove
> Assignee: Andy Grove
> Priority: Major
> Fix For: 4.0.0


