[jira] [Updated] (ARROW-11059) [Rust] [DataFusion] Implement extensible configuration mechanism
[ https://issues.apache.org/jira/browse/ARROW-11059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Grove updated ARROW-11059:
---
    Fix Version/s:     (was: 4.0.0)

> [Rust] [DataFusion] Implement extensible configuration mechanism
>
>                 Key: ARROW-11059
>                 URL: https://issues.apache.org/jira/browse/ARROW-11059
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Rust - DataFusion
>            Reporter: Andy Grove
>            Assignee: Andy Grove
>            Priority: Major
>
> We are getting to the point where there are multiple settings we could add to
> operators to fine-tune performance. Custom operators provided by crates that
> extend DataFusion may also need this capability.
> I propose that we add support for key-value configuration options so that we
> don't need to plumb through each new configuration setting that we add.
> For example, I am about to start on a "coalesce batches" operator and I would
> like a setting such as "coalesce.batch.size".
> For built-in settings like this we can provide information such as
> documentation and default values and generate documentation from this.
> For example, here is how Spark defines configs:
> {code:java}
> val PARQUET_VECTORIZED_READER_ENABLED =
>   buildConf("spark.sql.parquet.enableVectorizedReader")
>     .doc("Enables vectorized parquet decoding.")
>     .version("2.0.0")
>     .booleanConf
>     .createWithDefault(true)
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
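For readers following the proposal, here is a minimal Rust sketch of what a key-value configuration mechanism with documented defaults might look like. It is not DataFusion's API: the names ConfigEntry, ConfigOptions, get_or_default and generate_docs are hypothetical, and the default of 4096 for "coalesce.batch.size" is an illustrative value that the issue does not specify.

```rust
use std::collections::HashMap;
use std::str::FromStr;

/// Hypothetical description of one built-in setting: key, documentation
/// and default value (stored as a string, like the values themselves).
#[derive(Debug, Clone)]
pub struct ConfigEntry {
    pub key: &'static str,
    pub doc: &'static str,
    pub default: &'static str,
}

/// Hypothetical built-in setting named in the issue. The default of 4096
/// is an assumption made for this example only.
pub const COALESCE_BATCH_SIZE: ConfigEntry = ConfigEntry {
    key: "coalesce.batch.size",
    doc: "Target number of rows per batch produced by the coalesce batches operator.",
    default: "4096",
};

/// String-keyed option store. Extension crates can set and read their own
/// keys without any new field being plumbed through the engine.
#[derive(Debug, Default)]
pub struct ConfigOptions {
    values: HashMap<String, String>,
}

impl ConfigOptions {
    pub fn new() -> Self {
        Self::default()
    }

    /// Set (or overwrite) the value stored under `key`.
    pub fn set(&mut self, key: &str, value: &str) {
        self.values.insert(key.to_string(), value.to_string());
    }

    /// Raw lookup, usable for keys that have no registered ConfigEntry.
    pub fn get(&self, key: &str) -> Option<&str> {
        self.values.get(key).map(|s| s.as_str())
    }

    /// Typed lookup for a described setting. Falls back to the entry's
    /// default when the key is unset and reports unparsable values.
    pub fn get_or_default<T: FromStr>(&self, entry: &ConfigEntry) -> Result<T, String> {
        let raw = self.get(entry.key).unwrap_or(entry.default);
        raw.parse::<T>()
            .map_err(|_| format!("invalid value '{}' for '{}'", raw, entry.key))
    }
}

/// Render one documentation line per entry, in the spirit of generating
/// documentation from the setting definitions.
pub fn generate_docs(entries: &[ConfigEntry]) -> String {
    entries
        .iter()
        .map(|e| format!("{} (default: {}): {}\n", e.key, e.default, e.doc))
        .collect()
}
```

With this sketch, an operator could read its batch size with options.get_or_default::<usize>(&COALESCE_BATCH_SIZE), and a crate extending DataFusion could call options.set and options.get with its own keys. Storing values as strings keeps the store open to arbitrary keys, at the cost of parsing on every typed read.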
[jira] [Updated] (ARROW-11059) [Rust] [DataFusion] Implement extensible configuration mechanism
[ https://issues.apache.org/jira/browse/ARROW-11059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Grove updated ARROW-11059:
---
    Fix Version/s:     (was: 3.0.0)
                       4.0.0

> [Rust] [DataFusion] Implement extensible configuration mechanism
>
>                 Key: ARROW-11059
>                 URL: https://issues.apache.org/jira/browse/ARROW-11059
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Rust - DataFusion
>            Reporter: Andy Grove
>            Assignee: Andy Grove
>            Priority: Major
>             Fix For: 4.0.0