Thank you both, Andrew and Dima,

It is very good to know that the performance penalty is not too large. We will 
investigate HDFS encryption and HSMs, and of course we will test the 
performance impact ourselves.
I think I may have misunderstood the purpose of encryption, if NFS doesn't 
provide more protection. For me, the major goal of encryption is that when the 
data is physically lost, nobody can read it without the key. So unless the NFS 
storage and the data disks are lost to the same person, the data is safe. But I 
should really start reading about HSMs.

I very much appreciate your help.
Ming

-----Original Message-----
From: Andrew Purtell [mailto:andrew.purt...@gmail.com] 
Sent: June 7, 2016 0:37
To: user@hbase.apache.org
Cc: Zhang, Yi (Eason) <yi.zh...@esgyn.cn>
Subject: Re: hbase 'transparent encryption' feature is production ready or not?

> if we move the encryption to the HDFS level, we can no longer enable 
> encryption per table, I think? I assume encryption will impact 
> performance to some extent, so we may want to enable it per 
> table

That's correct, at the HDFS level encryption will be over the entire HBase 
data. I can offer my personal anecdote, which is HDFS encryption adds a modest 
read penalty and a write penalty that's hard to measure. This can be better 
than what you'll see using the default codecs provided with HBase encryption 
because the HDFS implementation can use native code acceleration - assuming you 
have the Hadoop native libraries properly built and available. 

> if, for example, we set up separate storage, such as an NFS share, 
> which is mounted on each node of the HBase cluster, and we put the key 
> there, is that an acceptable plan

No, this won't provide any additional protection over using local keystore 
files. 
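
To make the reasoning concrete, here is a minimal sketch of the envelope 
scheme discussed in this thread: a data key is wrapped (encrypted) under a 
master key, so the wrapped data key is only as safe as the master key. It uses 
only the JDK's AES Key Wrap cipher and is not HBase's or Hadoop's actual 
implementation; the class name EnvelopeSketch and its methods are hypothetical 
names chosen for illustration.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical illustration class; not part of HBase or Hadoop.
final class EnvelopeSketch {

    // Generate a random 128-bit AES key (used here for both master and data keys).
    static SecretKey newAesKey() throws GeneralSecurityException {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        return gen.generateKey();
    }

    // Wrap (encrypt) the data key under the master key using JDK AES Key Wrap.
    static byte[] wrap(SecretKey master, SecretKey dataKey) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AESWrap");
        cipher.init(Cipher.WRAP_MODE, master);
        return cipher.wrap(dataKey);
    }

    // Unwrap (decrypt) a wrapped data key; throws if the integrity check fails.
    static SecretKey unwrap(SecretKey master, byte[] wrapped) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AESWrap");
        cipher.init(Cipher.UNWRAP_MODE, master);
        return (SecretKey) cipher.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
    }

    // True only if the candidate key unwraps the blob to the expected data key.
    static boolean canRecover(SecretKey candidate, byte[] wrapped, SecretKey expected) {
        try {
            SecretKey recovered = unwrap(candidate, wrapped);
            return Arrays.equals(recovered.getEncoded(), expected.getEncoded());
        } catch (GeneralSecurityException e) {
            return false; // wrong key: the wrap integrity check rejects it
        }
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SecretKey master = newAesKey();    // stands in for the key held in a keystore or HSM
        SecretKey attacker = newAesKey();  // someone without access to the master key
        SecretKey dataKey = newAesKey();   // the key that actually encrypts table data
        byte[] wrapped = wrap(master, dataKey); // this blob can sit next to the data

        System.out.println("holder of master key recovers data key: "
                + canRecover(master, wrapped, dataKey));
        System.out.println("holder of unrelated key recovers data key: "
                + canRecover(attacker, wrapped, dataKey));
    }
}
```

Under these assumptions, whoever can read the master key can unwrap every 
data key, whether that key sits in a local keystore file or in a file on a 
share mounted on the same nodes.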

> On Jun 6, 2016, at 9:07 AM, Liu, Ming (Ming) <ming....@esgyn.cn> wrote:
> 
> Hi again, Andrew,
> 
> I still have a question: if we move the encryption to the HDFS level, we 
> can no longer enable encryption per table, I think? 
> I assume encryption will impact performance to some extent, so we may want 
> to enable it per table. Are there any performance tests that show how 
> much overhead encryption introduces? If it is very small, then I am very 
> happy to do it in HDFS and encrypt all data.
> I have not yet started to study HSMs, but if, for example, we set up 
> separate storage, such as an NFS share, which is mounted on each node of the 
> HBase cluster, and we put the key there, is that an acceptable plan?
> 
> Thanks,
> Ming
> 
> -----Original Message-----
> From: Andrew Purtell [mailto:apurt...@apache.org]
> Sent: June 3, 2016 12:27
> To: user@hbase.apache.org
> Cc: Zhang, Yi (Eason) <yi.zh...@esgyn.cn>
> Subject: Re: RE: hbase 'transparent encryption' feature is production ready or not?
> 
>> We are now confident to use this feature.
> 
> You should test carefully for your use case in any case.
> 
>> HSM is a good option, I am new to it. But will look at it.
> 
> I recommend using HDFS's transparent encryption feature instead of 
> HBase transparent encryption if you're only just now thinking about 
> HSMs and key protection in general. Storing the master key on the same 
> nodes as the encrypted data will defeat protection. This should be 
> offloaded to a protected domain. Hadoop ships with a software KMS 
> that, while it has limitations, can be set up on a specially secured 
> server and HDFS TDE can take advantage of it. (HBase TDE doesn't 
> support the Hadoop KMS.)
> 
> Advice offered for what it's worth (smile)
> 
> 
>> On Thu, Jun 2, 2016 at 9:16 PM, Liu, Ming (Ming) <ming....@esgyn.cn> wrote:
>> 
>> Thank you Andrew!
>> 
>> What we heard must have been a rumor :-) We are now confident in using 
>> this feature.
>> 
>> An HSM is a good option. I am new to it, but I will look into it.
>> 
>> Thanks,
>> Ming
>> -----Original Message-----
>> From: Andrew Purtell [mailto:apurt...@apache.org]
>> Sent: June 3, 2016 8:59
>> To: user@hbase.apache.org
>> Cc: Zhang, Yi (Eason) <yi.zh...@esgyn.cn>
>> Subject: Re: hbase 'transparent encryption' feature is production ready or not?
>> 
>>> We heard from various sources that it is not production ready before.
>> 
>> Said by whom, specifically?
>> 
>>> During our tests, we do find out it works not very stable, but 
>>> probably due to our lack of experience of this feature.
>> 
>> If you have something repeatable, please consider filing a JIRA to 
>> report the problem.
>> 
>>> And, we now save the encryption key in the disk, so we were 
>>> wondering, this is something not secure.
>> 
>> Data keys are encrypted with a master key which must be protected. 
>> The out of the box key provider stores the master key in a local keystore.
>> That's not sufficient protection. In a production environment you 
>> will want to use a HSM. Most (all?) HSMs support the keystore API. If 
>> that is not sufficient, our KeyProvider API is extensible for the 
>> solution you choose to employ in production.
>> 
>> Have you looked at HDFS transparent encryption?
>> 
>> https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html
>> 
>> Because it works at the HDFS layer it's a more general solution. Be 
>> careful what version of Hadoop you use if opting for HDFS TDE, though. 
>> Pick the most recent release. Slightly older versions (like 2.6.0) had 
>> fatal bugs if used in conjunction with HBase.
>> 
>> 
>> 
>> On Thu, Jun 2, 2016 at 5:52 PM, Liu, Ming (Ming) <ming....@esgyn.cn>
>> wrote:
>> 
>>> Hi, all,
>>> 
>>> We are trying to deploy, in our product, the 'transparent encryption' 
>>> feature of HBase described in the HBase reference guide:
>>> https://hbase.apache.org/book.html#hbase.encryption.server
>>> We had previously heard from various sources that it is not production 
>>> ready.
>>> 
>>> During our tests, we did find that it is not very stable, but that is 
>>> probably due to our lack of experience with this feature. It works 
>>> sometimes and sometimes does not, and when we retry the same 
>>> configuration it works again. We were using HBase 1.0.
>>> 
>>> Could anyone tell us whether this feature is already 
>>> stable and can be used in a production environment?
>>> 
>>> Also, we currently save the encryption key on disk, and we 
>>> wonder whether this is insecure. Since the key is in the same 
>>> place as the data, anyone who can access the data can also 
>>> access the key and decrypt the data. Is there any 
>>> best practice for managing the key?
>>> 
>>> Thanks,
>>> Ming
>> 
>> 
>> --
>> Best regards,
>> 
>>   - Andy
>> 
>> Problems worthy of attack prove their worth by hitting back. - Piet 
>> Hein (via Tom White)
> 
> 
> 
> --
> Best regards,
> 
>   - Andy
> 
> Problems worthy of attack prove their worth by hitting back. - Piet 
> Hein (via Tom White)
