[
https://issues.apache.org/jira/browse/HIVE-7934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237779#comment-14237779
]
Xiaomeng Huang commented on HIVE-7934:
--------------------------------------
Hi [~chirag.aggarwal]
When I worked on this feature, KMS had not yet been released in Hadoop 2.5.x, so
I decided to write a generic crypto codec and key management in Hive. However,
after talking with some committers who are familiar with security in
hadoop-common, I learned that it partly duplicates the crypto codec in
hadoop-common. I just looked at the Hadoop release notes, and Hadoop 2.6.0 seems
to include the KMS feature. HIVE-8049 is the initial patch to implement Hive
column-level encryption based on KMS in hadoop-common, and HIVE-8252 and
HIVE-8416 will be closed as duplicates. I will update the HIVE-8049 patch in the
next few days. Thanks for watching!
> Improve column level encryption with key management
> ---------------------------------------------------
>
> Key: HIVE-7934
> URL: https://issues.apache.org/jira/browse/HIVE-7934
> Project: Hive
> Issue Type: Improvement
> Reporter: Xiaomeng Huang
> Assignee: Xiaomeng Huang
> Priority: Minor
>
> HIVE-6329 provides a framework for column-level encryption/decryption, but its
> implementation only uses Base64, which is not safe and has some problems:
> - With Base64WriteOnly, every user can only get the ciphertext from the
> client.
> - With Base64Rewriter, every user can get the plaintext from the client.
> I have an improvement based on HIVE-6329 that uses key management via KMS.
> # Set up KMS and configure kms-acls.xml (e.g. user1 and root have permission
> to get the key)
> {code}
> <property>
> <name>hadoop.kms.acl.GET</name>
> <value>user1 root</value>
> <description>
> ACL for get-key-version and get-current-key operations.
> </description>
> </property>
> {code}
> # set hive-site.xml
> {code}
> <property>
> <name>hadoop.security.kms.uri</name>
> <value>http://localhost:16000/kms</value>
> </property>
> {code}
> # create an encrypted table
> {code}
> -- region-aes-column.q
> drop table region_aes_column;
> create table region_aes_column (r_regionkey int, r_name string) ROW FORMAT
> SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> WITH SERDEPROPERTIES ('column.encode.columns'='r_name',
> 'column.encode.classname'='org.apache.hadoop.hive.serde2.aes.AESRewriter')
> STORED AS TEXTFILE TBLPROPERTIES("hive.encrypt.keynames"="hive.k1");
> insert overwrite table region_aes_column
> select
> r_regionkey, r_name
> from region;
> {code}
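> # Query the table as different users. This is transparent to users: it is
> convenient and they don't need to set anything. In the session below, root
> and user1 (who may get the key) see plaintext, while user2 sees ciphertext.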
> {code}
> [root@huang1 hive_data]# hive
> hive> select * from region_aes_column;
> OK
> 0 AFRICA
> 1 AMERICA
> 2 ASIA
> 3 EUROPE
> 4 MIDDLE EAST
> Time taken: 0.9 seconds, Fetched: 5 row(s)
> [root@huang1 hive_data]# su user1
> [user1@huang1 hive_data]$ hive
> hive> select * from region_aes_column;
> OK
> 0 AFRICA
> 1 AMERICA
> 2 ASIA
> 3 EUROPE
> 4 MIDDLE EAST
> Time taken: 0.899 seconds, Fetched: 5 row(s)
> [root@huang1 hive_data]# su user2
> [user2@huang1 hive_data]$ hive
> hive> select * from region_aes_column;
> OK
> 0 RcQycWVD
> 1 Rc8lam9Bxg==
> 2 RdEpeQ==
> 3 Qdcyd3ZH
> 4 ScskfGpHp8KIIuY=
> Time taken: 0.749 seconds, Fetched: 5 row(s)
> {code}
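
The `hadoop.kms.acl.GET` value in the first step decides who sees plaintext, so a small sketch of how such an ACL string might be read can help. It assumes the usual Hadoop ACL convention (comma-separated users, then a space, then comma-separated groups, with `*` meaning everyone). Under that reading, `root` in the example value is a group name rather than a user name. The `<configuration>` wrapper and the helper names `load_acl`, `parse_acl` and `is_allowed` are hypothetical and are not part of KMS or Hive.

```python
import xml.etree.ElementTree as ET

# Sample kms-acls.xml content; the <configuration> wrapper is an assumption.
KMS_ACLS_XML = """
<configuration>
  <property>
    <name>hadoop.kms.acl.GET</name>
    <value>user1 root</value>
  </property>
</configuration>
"""


def load_acl(xml_text, name):
    """Return the raw ACL string for property `name`, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


def parse_acl(acl):
    """Split an ACL string into (users, groups, allow_all).

    Assumed format: "user1,user2 group1,group2"; "*" allows everyone.
    """
    if acl.strip() == "*":
        return set(), set(), True
    # Do not strip leading whitespace: " group1" means "no users, one group".
    parts = acl.split(" ", 1)
    users = {u.strip() for u in parts[0].split(",") if u.strip()}
    groups = set()
    if len(parts) > 1:
        groups = {g.strip() for g in parts[1].split(",") if g.strip()}
    return users, groups, False


def is_allowed(acl, user, user_groups=()):
    """Check whether `user` (with the given group memberships) passes the ACL."""
    users, groups, allow_all = parse_acl(acl)
    return allow_all or user in users or bool(groups & set(user_groups))


if __name__ == "__main__":
    acl = load_acl(KMS_ACLS_XML, "hadoop.kms.acl.GET")
    for user, user_groups in [("root", ["root"]), ("user1", ["users"]), ("user2", ["users"])]:
        print(user, is_allowed(acl, user, user_groups))
```

If this reading is right, writing the value as `user1,root` would list both names as users, which may be closer to the intent described in the step.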
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)