GitHub user jerryshao opened a pull request: https://github.com/apache/spark/pull/14065
[SPARK-16342][YARN][WIP] Add a configurable token manager for Spark running on YARN

## What changes were proposed in this pull request?

Add a configurable token manager for Spark running on YARN.

### Current Problems ###

1. The supported token providers are hard-coded: currently only hdfs, hbase and hive are supported, and users cannot add a new token provider without code changes.
2. The same problem exists in the periodic token renewer and updater.

### Changes In This Proposal ###

To address the problems above and make the current code cleaner and easier to understand, this proposal makes three main changes:

1. Abstract a `ServiceTokenProvider` interface, as well as a `ServiceTokenRenewable` interface, for token providers. Each service that wants to communicate with Spark using tokens needs to implement this interface.
2. Provide a `ConfigurableTokenManager` to manage all registered token providers, as well as the token renewer and updater. This class also offers the API that other modules use to obtain tokens, get the renewal interval, and so on.
3. Implement three built-in token providers, `HDFSTokenProvider`, `HiveTokenProvider` and `HBaseTokenProvider`, to keep the same semantics as supported today. Whether each built-in token provider is loaded is controlled by the configuration `spark.yarn.security.tokens.${service}.enabled`; by default all built-in token providers are loaded.

### Behavior Changes ###

For the end user there is no behavior change: the same configuration `spark.yarn.security.tokens.${service}.enabled` still decides which token provider is enabled (hbase or hive). A user-implemented token provider (assume its name is "test") needs two configurations in order to be added:

1. `spark.yarn.security.tokens.test.enabled` set to true.
2. `spark.yarn.security.tokens.test.class` set to the fully qualified class name.

So the semantics of the current code are kept, with one new configuration added.
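The configuration rules described above can be illustrated with a small, self-contained sketch. This is not code from the pull request: the class `TokenProviderSettings`, the method `enabledProviders`, the `<built-in>` placeholder and the `com.example` class names are all hypothetical, and a plain `Map` stands in for the Spark configuration. It also assumes that a user-defined provider with a `.class` key but no `.enabled` key is treated as disabled, which the description does not state explicitly.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of the pull request.
final class TokenProviderSettings {
    static final String PREFIX = "spark.yarn.security.tokens.";
    static final String CLASS_SUFFIX = ".class";
    // Built-in services named in the description.
    static final List<String> BUILT_IN = Arrays.asList("hdfs", "hive", "hbase");

    /**
     * Returns a sorted map of service name to provider class name for every
     * enabled provider. Built-in providers map to the placeholder "<built-in>".
     */
    static Map<String, String> enabledProviders(Map<String, String> conf) {
        Map<String, String> result = new TreeMap<>();

        // Built-in providers are loaded unless explicitly disabled.
        for (String service : BUILT_IN) {
            String enabled = conf.getOrDefault(PREFIX + service + ".enabled", "true");
            if (Boolean.parseBoolean(enabled)) {
                result.put(service, "<built-in>");
            }
        }

        // User-defined providers need both "<service>.class" and "<service>.enabled=true".
        for (Map.Entry<String, String> entry : conf.entrySet()) {
            String key = entry.getKey();
            if (!key.startsWith(PREFIX) || !key.endsWith(CLASS_SUFFIX)) {
                continue;
            }
            // Skip keys that leave no room for a service name.
            if (key.length() <= PREFIX.length() + CLASS_SUFFIX.length()) {
                continue;
            }
            String service =
                key.substring(PREFIX.length(), key.length() - CLASS_SUFFIX.length());
            if (BUILT_IN.contains(service)) {
                continue;
            }
            // Assumption: a missing ".enabled" key means the provider is disabled.
            String enabled = conf.getOrDefault(PREFIX + service + ".enabled", "false");
            if (Boolean.parseBoolean(enabled)) {
                result.put(service, entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("spark.yarn.security.tokens.hive.enabled", "false");
        conf.put("spark.yarn.security.tokens.test.enabled", "true");
        conf.put("spark.yarn.security.tokens.test.class", "com.example.TestTokenProvider");
        System.out.println(enabledProviders(conf));
    }
}
```

In this sketch, disabling hive leaves hdfs and hbase loaded by default, while the "test" provider is picked up only because both of its keys are present.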
### Current Status ###

- [x] Token provider interface and management framework.
- [x] Implement built-in token providers (hdfs, hbase, hive).
- [ ] Unit test coverage.
- [ ] Integration test with a secure cluster.

## How was this patch tested?

Unit tests and integration tests.

Please review and suggest changes; any comment is greatly appreciated.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jerryshao/apache-spark SPARK-16342

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14065.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #14065

----

commit 9e9311cd956eb0b2f900625b042c5c22d1016a08
Author: jerryshao <ss...@hortonworks.com>
Date: 2016-07-01T09:40:41Z

    Add ConfigurableTokenManager initial commit

commit 3aaf0706d71321c2c150e0ddd21fee1cd218a4e1
Author: jerryshao <ss...@hortonworks.com>
Date: 2016-07-05T06:07:58Z

    Further change on ConfigurableTokenManager

commit 9e8702140614f0551cad889e26f98ed36d7f6f15
Author: jerryshao <ss...@hortonworks.com>
Date: 2016-07-05T09:18:43Z

    Some refactory works and unit test added

commit 8c0821b2074799a05c3dbb448368b3f195eff661
Author: jerryshao <ss...@hortonworks.com>
Date: 2016-07-06T06:38:29Z

    Add more unit tests

commit 90f194e34ffd198d2b1ee5b04f24afd8c4454d90
Author: jerryshao <ss...@hortonworks.com>
Date: 2016-07-06T07:01:31Z

    Add more comments

----