Andy Grove created ARROW-11059:
----------------------------------

Summary: [Rust] [DataFusion] Implement extensible configuration mechanism
Key: ARROW-11059
URL: https://issues.apache.org/jira/browse/ARROW-11059
Project: Apache Arrow
Issue Type: New Feature
Components: Rust - DataFusion
Reporter: Andy Grove
Assignee: Andy Grove
Fix For: 3.0.0

We are getting to the point where there are multiple settings we could add to operators to fine-tune performance. Custom operators provided by crates that extend DataFusion may also need this capability.

I propose that we add support for key-value configuration options so that we don't need to plumb each new configuration setting through the code as we add it.

For example, I am about to start on a "coalesce batches" operator and I would like a setting such as "coalesce.batch.size".

For built-in settings like this we can record information such as a description and a default value, and generate documentation from those definitions.

For example, here is how Spark defines configs:

{code:java}
val PARQUET_VECTORIZED_READER_ENABLED =
  buildConf("spark.sql.parquet.enableVectorizedReader")
    .doc("Enables vectorized parquet decoding.")
    .version("2.0.0")
    .booleanConf
    .createWithDefault(true)
{code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
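
For discussion, here is a minimal sketch of what a key-value configuration mechanism along these lines could look like in Rust. It is not DataFusion's API: `ConfigEntry`, `ConfigOptions`, `COALESCE_BATCH_SIZE`, `get_usize`, and `render_docs` are hypothetical names, and the default of 4096 and the doc string are placeholders chosen only for illustration. It uses only the standard library.

```rust
use std::collections::HashMap;

/// Hypothetical definition of a built-in setting: key, documentation, default.
#[derive(Debug, Clone)]
pub struct ConfigEntry {
    pub key: &'static str,
    pub doc: &'static str,
    pub default: &'static str,
}

/// Hypothetical entry for the setting named in the issue.
/// The doc text and default value are placeholders, not real DataFusion values.
pub const COALESCE_BATCH_SIZE: ConfigEntry = ConfigEntry {
    key: "coalesce.batch.size",
    doc: "Batch size setting for the coalesce batches operator.",
    default: "4096",
};

/// String-keyed options, so new settings need no extra plumbing.
#[derive(Debug, Default)]
pub struct ConfigOptions {
    values: HashMap<String, String>,
}

impl ConfigOptions {
    pub fn new() -> Self {
        Self::default()
    }

    /// Set any key, including keys defined by crates that extend the engine.
    pub fn set(&mut self, key: &str, value: &str) {
        self.values.insert(key.to_string(), value.to_string());
    }

    /// Raw lookup; returns None when the key was never set.
    pub fn get(&self, key: &str) -> Option<&str> {
        self.values.get(key).map(|s| s.as_str())
    }

    /// Typed lookup for a registered entry, falling back to its default.
    pub fn get_usize(&self, entry: &ConfigEntry) -> Result<usize, String> {
        let raw = self.get(entry.key).unwrap_or(entry.default);
        raw.parse::<usize>()
            .map_err(|e| format!("invalid value '{}' for {}: {}", raw, entry.key, e))
    }
}

/// Generate one line of documentation per registered entry.
pub fn render_docs(entries: &[ConfigEntry]) -> String {
    entries
        .iter()
        .map(|e| format!("{} (default: {}): {}", e.key, e.default, e.doc))
        .collect::<Vec<_>>()
        .join("\n")
}
```

In this sketch an operator would call `options.get_usize(&COALESCE_BATCH_SIZE)` and get the default unless the user has called `options.set("coalesce.batch.size", ...)`. An extension crate could store and read its own keys through `set` and `get` without any change to the core types. Storing values as strings keeps the map uniform at the cost of parsing on read; a typed value enum would be an equally reasonable choice.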