Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2099#discussion_r116669975
  
    --- Diff: external/storm-hive/README.md ---
    @@ -101,6 +101,80 @@ Hive Trident state also follows similar pattern to 
HiveBolt it takes in HiveOpti
        TridentState state = stream.partitionPersist(factory, hiveFields, new 
HiveUpdater(), new Fields());
      ```
        
    +   
    +      
    +##Working with Secure Hive
    +If your topology is going to interact with secure Hive, your bolts/states 
needs to be authenticated by Hive Server. We 
    +currently have 2 options to support this:
    +
    +### Using keytabs on all worker hosts
    +If you have distributed the keytab files for hive user on all potential 
worker hosts then you can use this method. You should specify a 
    +hive configs using the methods HiveOptions.withKerberosKeytab(), 
HiveOptions.withKerberosPrincipal() methods.
    +
    +On worker hosts the bolt/trident-state code will use the keytab file with 
principal provided in the config to authenticate with 
    +Hive. This method is little dangerous as you need to ensure all workers 
have the keytab file at the same location and you need
    +to remember this as you bring up new hosts in the cluster.
    +
    +
    +### Using Hive MetaStore delegation tokens 
    +Your administrator can configure nimbus to automatically get delegation 
tokens on behalf of the topology submitter user.
    +Since Hive depends on HDFS, we should also configure HDFS delegation 
tokens.The nimbus should be started with following configurations:
    +
    +More details about Hadoop Tokens here: 
https://github.com/apache/storm/blob/master/docs/storm-hive.md
    +
    +```
    +nimbus.autocredential.plugins.classes : 
["org.apache.storm.hive.security.AutoHive", 
"org.apache.storm.hdfs.security.AutoHDFS"]
    +nimbus.credential.renewers.classes : 
["org.apache.storm.hive.security.AutoHive", 
"org.apache.storm.hdfs.security.AutoHDFS"]
    +nimbus.credential.renewers.freq.secs : 82800 (23 hours)
    +
    +hive.keytab.file: "/path/to/keytab/on/nimbus" (This is the keytab of hive 
super user that can impersonate other users.)
    +hive.kerberos.principal: "superu...@example.com"
    +hive.metastore.uris: "thrift://server:9083"
    +
    +//hdfs configs
    +hdfs.keytab.file: "/path/to/keytab/on/nimbus" (This is the keytab of hdfs 
super user that can impersonate other users.)
    +hdfs.kerberos.principal: "superu...@example.com" 
    +```
    +
    +Your topology configuration should have:
    +
    +```
    +topology.auto-credentials :["org.apache.storm.hive.security.AutoHive", 
"org.apache.storm.hdfs.security.AutoHDFS"]
    +```
    +
    +If nimbus did not have the above configuration you need to add and then 
restart it. Ensure the hadoop configuration 
    +files (core-site.xml, hdfs-site.xml and hive-site.xml) and the storm-hive 
connector jar with all the dependencies is present in nimbus's classpath.
    +
    +As an alternative to adding the configuration files (core-site.xml, 
hdfs-site.xml and hive-site.xml) to the classpath, you could specify the 
configurations
    +as a part of the topology configuration. E.g. in you custom storm.yaml (or 
-c option while submitting the topology),
    +
    +```
    +hiveCredentialsConfigKeys : ["cluster1", "cluster2"] (the hive clusters 
you want to fetch the tokens from)
    +cluster1: [{"config1": "value1", "config2": "value2", ... }] (A map of 
config key-values specific to cluster1)
    +cluster2: [{"config1": "value1", "hive.keytab.file": 
"/path/to/keytab/for/cluster2/on/nimubs", "hive.kerberos.principal": 
"cluster2u...@example.com", "hive.metastore.uris": "thrift://server:9083"}] 
(here along with other configs, we have custom keytab and principal for 
"cluster2" which will override the keytab/principal specified at topology level)
    --- End diff --
    
    The cluster value should be a map, not a list containing a map. Take a 
look at the hdfs and hbase docs, which were fixed recently.
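
    As a rough sketch of the map form (assuming the same keys as in the diff 
above, and that the only change needed is dropping the enclosing list 
brackets; the elided values are left exactly as they appear in the diff), the 
per-cluster entries might look like:

    ```
    hiveCredentialsConfigKeys : ["cluster1", "cluster2"]
    cluster1: {"config1": "value1", "config2": "value2", ... }
    cluster2: {"config1": "value1", "hive.keytab.file": "/path/to/keytab/for/cluster2/on/nimbus", "hive.kerberos.principal": "cluster2u...@example.com", "hive.metastore.uris": "thrift://server:9083"}
    ```

    Each cluster key then maps directly to its config key-values, so the 
cluster2 keytab and principal can still override the topology-level ones.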

