GitHub user jerryshao opened a pull request:

    https://github.com/apache/spark/pull/14065

    [SPARK-16342][YARN][WIP] Add a configurable token manager for Spark running on YARN

    ## What changes were proposed in this pull request?
    
    Add a configurable token manager for Spark running on YARN.
    
    ### Current Problems ###
    
    1. The set of supported token providers is hard-coded: currently only HDFS,
       HBase, and Hive are supported, and users cannot add a new token provider
       without code changes.
    2. The same problem exists in the periodic token renewer and updater.
    
    ### Changes In This Proposal ###
    
    To address the problems above and make the current code cleaner and easier
    to understand, this proposal makes three main changes:
    
    1. Abstract a `ServiceTokenProvider` as well as a `ServiceTokenRenewable`
       interface for token providers. Each service that wants to communicate
       with Spark via tokens needs to implement this interface.
    2. Provide a `ConfigurableTokenManager` to manage all the registered token
       providers, as well as the token renewer and updater. This class also
       offers an API for other modules to obtain tokens, get the renewal
       interval, and so on.
    3. Implement three built-in token providers, `HDFSTokenProvider`,
       `HiveTokenProvider`, and `HBaseTokenProvider`, to keep the same semantics
       as are supported today. Whether these built-in token providers are loaded
       is controlled by the configuration
       `spark.yarn.security.tokens.${service}.enabled`; by default, all the
       built-in token providers are loaded.
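
    The three changes above can be pictured with a minimal, dependency-free
    sketch. Only the names `ServiceTokenProvider`, `ServiceTokenRenewable`,
    `ConfigurableTokenManager`, and the `.enabled` configuration key come from
    this description; every method name and signature, the `Token` stand-in,
    and the stub providers are assumptions made for illustration and are not
    the code in this patch.

```scala
// Hypothetical sketch only: method names and signatures are assumptions.
object TokenManagerSketch {
  // Stand-in for a Hadoop delegation token (the real code would use Hadoop types).
  final case class Token(service: String, expiresAtMs: Long)

  // A provider obtains tokens for exactly one service.
  trait ServiceTokenProvider {
    def serviceName: String
    def obtainTokens(conf: Map[String, String]): Seq[Token]
  }

  // Optional mix-in for providers whose tokens can be renewed.
  trait ServiceTokenRenewable { self: ServiceTokenProvider =>
    def renewalIntervalMs(conf: Map[String, String]): Long
  }

  // Keeps only the providers enabled via
  // spark.yarn.security.tokens.${service}.enabled (assumed default: true).
  class ConfigurableTokenManager(
      conf: Map[String, String],
      candidates: Seq[ServiceTokenProvider]) {

    private def enabled(p: ServiceTokenProvider): Boolean =
      conf.getOrElse(s"spark.yarn.security.tokens.${p.serviceName}.enabled", "true")
        .equalsIgnoreCase("true")

    val providers: Map[String, ServiceTokenProvider] =
      candidates.filter(enabled).map(p => p.serviceName -> p).toMap

    // Collect tokens from every enabled provider.
    def obtainTokens(): Seq[Token] =
      providers.values.toSeq.flatMap(_.obtainTokens(conf))

    // Smallest renewal interval among renewable providers, if any.
    def minRenewalIntervalMs: Option[Long] = {
      val intervals = providers.values.collect {
        case r: ServiceTokenRenewable => r.renewalIntervalMs(conf)
      }
      if (intervals.isEmpty) None else Some(intervals.min)
    }
  }

  // Stub providers used only to exercise the sketch.
  class StubHdfsProvider extends ServiceTokenProvider with ServiceTokenRenewable {
    val serviceName: String = "hdfs"
    def obtainTokens(conf: Map[String, String]): Seq[Token] = Seq(Token("hdfs", 1000L))
    def renewalIntervalMs(conf: Map[String, String]): Long = 500L
  }

  class StubHiveProvider extends ServiceTokenProvider {
    val serviceName: String = "hive"
    def obtainTokens(conf: Map[String, String]): Seq[Token] = Seq(Token("hive", 2000L))
  }
}
```

    In this sketch, passing a configuration that sets
    `spark.yarn.security.tokens.hive.enabled` to `false` leaves only the stub
    HDFS provider registered, so other modules would obtain tokens and the
    renewal interval from that provider alone.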
    
    ### Behavior Changes ###
    
    For the end user there is no behavior change: the same configuration
    `spark.yarn.security.tokens.${service}.enabled` still decides which token
    provider is enabled (hbase or hive).
    
    A user-implemented token provider (assume its name is "test") that needs to
    be added to the token manager requires two configurations:
    
    1. `spark.yarn.security.tokens.test.enabled` set to `true`.
    2. `spark.yarn.security.tokens.test.class` set to the fully qualified class name.
    
    So we keep the same semantics as the current code while adding one new
    configuration.
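
    As an illustration of how the two keys for a user-implemented provider
    could be read together, the following sketch maps each enabled service to
    its configured class name. The helper `customProviderClasses`, the object
    name, and the rule that a missing `.enabled` key means "not loaded" for
    user-implemented providers are assumptions for this example, not the
    behavior of this patch.

```scala
import java.util.regex.Pattern

// Hypothetical sketch only: helper and object names are illustrative.
object CustomProviderConfigSketch {
  private val Prefix = "spark.yarn.security.tokens."

  // Matches keys of the form spark.yarn.security.tokens.<service>.class
  private val ClassKey = (Pattern.quote(Prefix) + "(.+)\\.class").r

  // Returns service name -> class name for every service that has a class
  // configured AND has its ".enabled" key explicitly set to true.
  def customProviderClasses(conf: Map[String, String]): Map[String, String] =
    conf.collect {
      case (ClassKey(service), className)
          if conf.get(s"$Prefix$service.enabled").exists(_.equalsIgnoreCase("true")) =>
        service -> className
    }
}
```

    A class name resolved this way could then be instantiated through standard
    JVM reflection, for example
    `Class.forName(name).getDeclaredConstructor().newInstance()`.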
    
    ### Current Status ###
    
    - [x] token provider interface and management framework.
    - [x] implement built-in token providers (hdfs, hbase, hive).
    - [ ] Unit test coverage.
    - [ ] Integration test with a secure cluster.
    
    ## How was this patch tested?
    
    Unit tests and integration tests.
    
    Please review and suggest improvements; any comments are greatly appreciated.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jerryshao/apache-spark SPARK-16342

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14065.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14065
    
----
commit 9e9311cd956eb0b2f900625b042c5c22d1016a08
Author: jerryshao <ss...@hortonworks.com>
Date:   2016-07-01T09:40:41Z

    Add ConfigurableTokenManager initial commit

commit 3aaf0706d71321c2c150e0ddd21fee1cd218a4e1
Author: jerryshao <ss...@hortonworks.com>
Date:   2016-07-05T06:07:58Z

    Further change on ConfigurableTokenManager

commit 9e8702140614f0551cad889e26f98ed36d7f6f15
Author: jerryshao <ss...@hortonworks.com>
Date:   2016-07-05T09:18:43Z

    Some refactory works and unit test added

commit 8c0821b2074799a05c3dbb448368b3f195eff661
Author: jerryshao <ss...@hortonworks.com>
Date:   2016-07-06T06:38:29Z

    Add more unit tests

commit 90f194e34ffd198d2b1ee5b04f24afd8c4454d90
Author: jerryshao <ss...@hortonworks.com>
Date:   2016-07-06T07:01:31Z

    Add more comments

----

