[ 
https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041008#comment-14041008
 ] 

Aaron T. Myers commented on HDFS-6134:
--------------------------------------

Sanjay, Steve - regarding distcp, Alejandro has already said the following, 
which I think addresses what both of you are getting at. Note the second 
paragraph:

{quote}
Vanilla distcp will just work with transparent encryption. Data will be 
decrypted on read and encrypted on write, assuming both source and target are 
in encrypted zones.

The proposal on changing distcp is to enable a second use case: copying data 
from one cluster to another without having to decrypt/encrypt the data while 
doing the copy. This is useful when doing copies for disaster recovery: HDFS 
admins could do the copy without having access to the encryption keys.
{quote}

Sanjay:

bq. Turns out this issue came up in discussion with Owen, and he shares the 
concern and suggested that I post the concern. Besides, even if Alejandro and 
Owen are in agreement, my question is relevant and has not been raised so far 
above: Encryption is used to overcome limitations of authorization and 
authentication in the system. It is relevant to ask if the use of delegation 
tokens to obtain keys adds weakness.

Transparent at-rest encryption is used to address other possible attack 
vectors, for example an admin removing hard drives from the cluster and looking 
at the data offline, or various attack vectors if network communication can be 
intercepted.

I was under the impression that Owen's concern was mostly around performance, 
i.e. that he didn't want all of the many tasks/containers in an MR/YARN job to 
each request the same encryption key(s) from the KMS at startup. I think that's 
quite reasonable, but it doesn't need to be an either/or thing - YARN jobs can 
request the appropriate keys upfront to address performance concerns _and_ the 
KMS can accept DTs for authentication to enable other use cases.

Regardless, I don't see how being able to request encryption keys via DTs adds 
any weakness. The DTs can only be granted via Kerberos-authenticated channels, 
and they expire, so they allow no more access than one can get via Kerberos. 
Could you perhaps elaborate on the specific concern there?
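
To make that argument concrete, here is a toy model of it. This is only a sketch and not the KMS or Hadoop API: the names ToyKMS, DelegationToken, get_delegation_token and get_key are hypothetical, as are the ACL layout and the token lifetime. It assumes a token is issued only to a caller that has already authenticated via Kerberos, carries that caller's identity, and stops working once it expires.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class DelegationToken:
    """Hypothetical token: who it was issued to and when it stops working."""
    owner: str
    expires_at: float
    ident: str


class ToyKMS:
    """Toy stand-in for a key server. Not the real KMS API."""

    def __init__(self, acl, token_lifetime=60.0):
        self._acl = acl                  # key name -> set of users allowed to fetch it
        self._lifetime = token_lifetime  # seconds (assumed value)
        self._issued = set()             # stands in for verifying a token's signature

    def get_delegation_token(self, kerberos_principal, now=None):
        # A token is only handed to a caller that authenticated via Kerberos.
        if kerberos_principal is None:
            raise PermissionError("Kerberos authentication required")
        now = time.time() if now is None else now
        token = DelegationToken(kerberos_principal, now + self._lifetime,
                                secrets.token_hex(8))
        self._issued.add(token)
        return token

    def get_key(self, key_name, kerberos_principal=None, token=None, now=None):
        now = time.time() if now is None else now
        if kerberos_principal is not None:
            user = kerberos_principal
        elif token is not None and token in self._issued and now < token.expires_at:
            # The token only ever resolves to the identity it was issued for.
            user = token.owner
        else:
            raise PermissionError("no valid credentials")
        # The same ACL check applies whichever way the caller authenticated.
        if user not in self._acl.get(key_name, set()):
            raise PermissionError(f"{user} may not access {key_name}")
        return f"key-material-for-{key_name}"
```

In this model a token holder reaches exactly the keys the issuing principal could reach directly, and only until the token expires, which is the property I'm describing above.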

bq. Aaron .. you are misunderstanding my point. I am not saying that the 
discussions on this JIRA have not been open.<snip>

OK, good to hear. Sorry if I misinterpreted what you were saying.

bq. I am merely asking for one more meeting where I can quickly come up to 
speed on the context that Alejandro, Todd, Yi, Tianyou, Andrew, Atm, share. It 
will help me and others better understand the viewpoint that some of you share 
due to previous high bandwidth meetings.

I'm certainly open to another meeting in the abstract to bring folks up to 
speed, but I'd still like to know what questions you have that haven't been 
addressed so far on the JIRA. So far I think that most of the questions you've 
been asking have already been discussed.

> Transparent data at rest encryption
> -----------------------------------
>
>                 Key: HDFS-6134
>                 URL: https://issues.apache.org/jira/browse/HDFS-6134
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: security
>    Affects Versions: 2.3.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>         Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, 
> HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf
>
>
> Because of privacy and security regulations, for many industries, sensitive 
> data at rest must be in encrypted form. For example: the healthcare industry 
> (HIPAA regulations), the card payment industry (PCI DSS regulations) or the 
> US government (FISMA regulations).
> This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can 
> be used transparently by any application accessing HDFS via Hadoop Filesystem 
> Java API, Hadoop libhdfs C library, or WebHDFS REST API.
> The resulting implementation should be able to be used in compliance with 
> different regulation requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
