GitHub user vanzin opened a pull request: https://github.com/apache/spark/pull/14022
[SPARK-16272][core] Allow config values to reference conf, env, system props.

This allows configuration to be more flexible when the cluster does not have a homogeneous configuration (e.g. packages are installed at different paths on different nodes). By allowing a config value to reference the environment, it becomes possible to work around those differences in certain cases.

The feature is hooked up to spark.sql.hive.metastore.jars to show how to use it, and because that config is an example of this scenario that I ran into. It uses a new "pathConf" config type, which is shorthand for enabling variable expansion on string configs.

As part of the implementation, ConfigEntry now keeps track of all "known" configs (i.e. those created through ConfigBuilder), since the resolution code uses that list. This duplicates some code in SQLConf, which could potentially be merged with this now. It will also make it simpler to implement some missing features, such as filtering which configs show up in the UI or in event logs; those features are not part of this change.

Another change is in the way ConfigEntry reads config data: it now takes a string map and a function that reads env variables, so that it can be called from both SparkConf and SQLConf. As a result, both places follow the same read path, instead of SQLConf having to replicate certain logic. There are still a couple of methods in SQLConf that peek into fields of ConfigEntry directly, though.

Tested via unit tests, and by using the new variable expansion functionality in a shell session with a custom spark.sql.hive.metastore.jars value.
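To make the read path described above concrete, here is a minimal sketch in Python of expanding references against a string map, an env-reading function, and a system-properties map. It is not the code in this patch: the `${name}`, `${env:NAME}` and `${system:name}` reference syntax, the `expand` function, and the choice to leave unresolved references untouched are all assumptions made for illustration, since the description above does not spell them out.

```python
import os
import re

# Hypothetical reference syntax (an assumption, not taken from the patch):
#   ${some.conf.key}   -> another config value
#   ${env:NAME}        -> environment variable
#   ${system:name}     -> system property
_REF = re.compile(r"\$\{(?:(\w+):)?([^}\s]+)\}")


def expand(value, conf, getenv=os.environ.get, sysprops=None, _seen=()):
    """Expand references in `value`.

    `conf` is a plain string map, `getenv` is a function that reads env
    variables, and `sysprops` stands in for JVM system properties.
    """
    sysprops = sysprops or {}

    def lookup(match):
        prefix, name = match.group(1), match.group(2)
        if prefix is None:
            if name in _seen:
                raise ValueError("circular reference: " + name)
            raw = conf.get(name)
            # A referenced config value may itself contain references.
            resolved = None if raw is None else expand(
                raw, conf, getenv, sysprops, _seen + (name,))
        elif prefix == "env":
            resolved = getenv(name)
        elif prefix == "system":
            resolved = sysprops.get(name)
        else:
            resolved = None
        # Leave unresolved references as written rather than guessing.
        return match.group(0) if resolved is None else resolved

    return _REF.sub(lookup, value)


# Illustrative values only; the paths and the HIVE_HOME variable are made up.
conf = {"spark.sql.hive.metastore.jars": "${env:HIVE_HOME}/lib/*"}
env = {"HIVE_HOME": "/opt/hive"}
jars = expand(conf["spark.sql.hive.metastore.jars"], conf, env.get)
```

Passing the env reader in as a function, rather than reading `os.environ` directly, mirrors the point made above about ConfigEntry taking a string map plus an env-reading function: the same expansion logic can be driven from different config holders and is easy to exercise in unit tests with a fake environment.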
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vanzin/spark SPARK-16272

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14022.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14022

----
commit f4e772ad2b5c8231c48b2cd1f8f75c81e754c2b4
Author: Marcelo Vanzin <van...@cloudera.com>
Date:   2016-06-28T22:39:49Z

    [SPARK-16272][core] Allow config values to reference conf, env, system props.