[ 
https://issues.apache.org/jira/browse/HIVE-21894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16869476#comment-16869476
 ] 

Kristopher Kane commented on HIVE-21894:
----------------------------------------

[~bslim] Yes, this is the Kafka method, but two things are missing within the 
context of Hive execution: 

1) Protection of the credentials for the keystore/truststore - this will need 
Hadoop credential support similar to the JDBC handler's, retrieving the 
passwords and then adding them to the consumer config
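For reference, the JDBC-handler-style flow I have in mind looks roughly like 
the following (the aliases and the provider path here are hypothetical; only 
the credential-provider mechanism is the point):

{noformat}
# Store the keystore/truststore passwords in a Hadoop credential store
# (aliases and jceks path are examples, not final names)
hadoop credential create kafka.ssl.keystore.password \
  -provider jceks://hdfs/user/hive/kafka.jceks
hadoop credential create kafka.ssl.truststore.password \
  -provider jceks://hdfs/user/hive/kafka.jceks
{noformat}

The table would then reference the provider path in TBLPROPERTIES instead of a 
plain-text password, and the handler would resolve the aliases via 
Configuration.getPassword() when building the consumer config.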

2) That path is an absolute path on a Linux file system only.  We cannot assume 
that the Hive user would have the ability or infrastructure to distribute the 
keystore/truststore to every compute node or HS2 instance (I'm not actually 
sure when this would get loaded with regard to Hive bootstrapping of the handler)




I wrote up my idea and some questions on the dev mailing list yesterday. I'll 
post it here too: 


{noformat}
I’m attempting to add SSL support to the Kafka storage handler and will need to 
make the keystore/truststore available to KafkaConsumer via an absolute path on 
a local file system.

The ideal steps are these:

  1.  TBLPROPERTIES describe an absolute HDFS path for the keystore/truststore
  2.  The Kafka storage handler copies both files from HDFS to the container’s 
local FS and configures and builds the KafkaConsumer around an absolute 
reference to this local YARN container directory.
  3.  Passwords for these files are stored using TBLPROPERTIES references to 
Hadoop credentials similar to the examples in the JDBC storage handler

Looks like there is precedent for interacting with HDFS at the storage-handler 
level for HBase here: 
https://github.com/apache/hive/blob/master/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDeHelper.java#L366,
 then in the Druid Storage handler there is a reference to using the local 
filesystem for dependency jar additions: 
https://github.com/apache/hive/blob/master/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandlerUtils.java#L692

Need some help with a few questions:

  1.  In the second link, is that the file system under the temporary YARN 
container directory where all software/configuration is distributed?
  2.  Does the storage handler execute at the ‘mapper’ level, i.e., inside a 
YARN container?
  3.  Do these file movement steps seem ok and within a storage handler’s 
scope?{noformat}
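Steps 2 and 3 above could be sketched roughly as below. This is only a 
self-contained illustration of the flow: the class and method names are 
hypothetical, plain java.nio is used for the copy to keep the sketch runnable, 
and in the actual patch the copy would instead use Hadoop's 
FileSystem.copyToLocalFile() against the HDFS path taken from TBLPROPERTIES.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Properties;

public class KeystoreLocalizer {
    /**
     * Sketch of step 2: localize the keystore before building the consumer.
     * The real implementation would call
     * FileSystem.get(conf).copyToLocalFile(hdfsPath, localPath); plain
     * java.nio is used here only so the sketch stands alone.
     */
    static Path localize(Path remoteKeystore, Path localDir) throws IOException {
        Files.createDirectories(localDir);
        Path local = localDir.resolve(remoteKeystore.getFileName());
        Files.copy(remoteKeystore, local, StandardCopyOption.REPLACE_EXISTING);
        return local;
    }

    /**
     * Sketch of building the consumer config around the container-local copy.
     * The property keys are the standard Kafka client SSL keys; the password
     * would come from the Hadoop credential provider rather than plain text.
     */
    static Properties consumerProps(Path localKeystore, String keystorePass) {
        Properties p = new Properties();
        p.put("security.protocol", "SSL");
        p.put("ssl.keystore.location", localKeystore.toAbsolutePath().toString());
        p.put("ssl.keystore.password", keystorePass);
        return p;
    }
}
```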

> Hadoop credential password storage for the Kafka Storage handler when 
> security is SSL
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-21894
>                 URL: https://issues.apache.org/jira/browse/HIVE-21894
>             Project: Hive
>          Issue Type: Improvement
>          Components: kafka integration
>    Affects Versions: 4.0.0
>            Reporter: Kristopher Kane
>            Assignee: Kristopher Kane
>            Priority: Minor
>             Fix For: 4.0.0
>
>
> The Kafka storage handler assumes that if the Hive service is configured with 
> Kerberos then the destination Kafka cluster is also secured with the same 
> Kerberos realm or trust of realms.  The security configuration of the Kafka 
> client can be overwritten due to the additive operations of the Kafka client 
> configs, but, the only way to specify SSL and the keystore/truststore 
> user/pass is via plain text table properties. 
> This ticket proposes adding Hadoop credential security to the Kafka storage 
> handler in support of SSL secured Kafka clusters.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
