[ 
https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13954994#comment-13954994
 ] 

Yi Liu commented on HADOOP-10150:
---------------------------------

[~tucu00], thanks for the comments.
{quote}Regarding hflush, hsync. Unless I’m missing something, if the 
hflush/hsync is done at an offset which is not MOD of 16, things will break as 
the IV advancing is done on per encryption block (16 bytes).{quote}
Hflush/hsync will work well in CFS. The key point is that in CTR mode the 
cipher behaves like a stream cipher: encryption can be performed on data of 
any size, any random byte range can be decrypted, and the counter is 
calculated from the byte offset using the formula in our design doc. So an 
hflush/hsync at an offset that is not a multiple of 16 does not break 
anything; the reader derives the counter for the enclosing block and discards 
the leading keystream bytes.
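To make this concrete, here is a minimal sketch in plain JCE (not code from 
the attached patch) of random-access CTR decryption. It assumes the counter 
formula is the initial IV plus the block index; the design doc defines the 
exact formula.

{code}
import java.math.BigInteger;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class CtrRandomAccess {
  private static final int AES_BLOCK = 16;

  /** Counter block for the AES block containing `offset`, assuming
   *  counter = initial IV + (offset / 16) as a 128-bit big-endian integer. */
  static byte[] counterForOffset(byte[] initialIv, long offset) {
    BigInteger ctr = new BigInteger(1, initialIv)
        .add(BigInteger.valueOf(offset / AES_BLOCK));
    byte[] raw = ctr.toByteArray();
    byte[] iv = new byte[AES_BLOCK];
    int n = Math.min(raw.length, AES_BLOCK);
    // Right-align; keeping only the low 16 bytes also wraps on overflow.
    System.arraycopy(raw, raw.length - n, iv, AES_BLOCK - n, n);
    return iv;
  }

  /** Decrypt bytes that start at absolute stream position `offset`;
   *  no block alignment is required. */
  static byte[] decryptAt(byte[] key, byte[] initialIv, long offset,
                          byte[] ciphertext) throws Exception {
    Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
    cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
        new IvParameterSpec(counterForOffset(initialIv, offset)));
    int skip = (int) (offset % AES_BLOCK);
    if (skip > 0) {
      cipher.update(new byte[skip]); // discard keystream before `offset`
    }
    return cipher.doFinal(ciphertext);
  }
}
{code}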
{quote}The Cfs.getDataKey(), it is not clear how the master key is to be 
fetched by clients and by job tasks. Plus, it seems that the idea is that every 
client job task will get hold of the master key (which could decrypt all stored 
keys). {quote}
cfs.getDataKey() could be refactored to use Owen’s HADOOP-10141 key provider 
interface, decoupling it from the underlying KMS. In the attached patch we 
want to show that a master key served from the client side can be used to 
decrypt the data encryption key; this client master key can differ from user 
to user. The master key can also be retrieved from a KMS and served via the 
HADOOP-10141 key provider interface, which is pluggable so end users can 
provide their own implementation. A similar approach can be seen in 
HADOOP-9333 and MAPREDUCE-4491, where we had quite a lot of discussion with 
@Benoy Antony.
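
As an illustration of this envelope-encryption flow, here is a minimal 
sketch; MasterKeyProvider and its method names are hypothetical stand-ins 
for a pluggable HADOOP-10141-style provider, and standard JCE AES key wrap 
stands in for whatever scheme the patch actually uses:

{code}
import java.security.Key;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical pluggable provider in the spirit of HADOOP-10141;
 *  the name and signature are illustrative only. */
interface MasterKeyProvider {
  /** Resolve the caller's master key, e.g. from a local store or a KMS. */
  byte[] getMasterKey(String user) throws Exception;
}

class DataKeyResolver {
  private final MasterKeyProvider provider;

  DataKeyResolver(MasterKeyProvider provider) {
    this.provider = provider;
  }

  /** Unwrap a stored, master-key-encrypted data encryption key (DEK). */
  Key unwrapDataKey(String user, byte[] wrappedDek) throws Exception {
    Key master = new SecretKeySpec(provider.getMasterKey(user), "AES");
    Cipher cipher = Cipher.getInstance("AESWrap");
    cipher.init(Cipher.UNWRAP_MODE, master);
    return cipher.unwrap(wrappedDek, "AES", Cipher.SECRET_KEY);
  }
}
{code}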

{quote}Also, there is no provision to allow master key rotation.{quote}
Since the client master key is controlled by the client, the client is 
responsible for key rotation. Because file data is encrypted with per-file 
data encryption keys, rotating the master key only requires re-encrypting 
those stored data keys, not the file data itself.
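
Rotation then amounts to re-wrapping the stored data encryption keys; a 
sketch continuing the assumptions above:

{code}
import java.security.Key;
import javax.crypto.Cipher;

class MasterKeyRotation {
  /** Re-wrap an existing data encryption key under a new master key;
   *  only the wrapped DEK changes, so file data is not re-encrypted. */
  static byte[] rewrap(byte[] wrappedDek, Key oldMaster, Key newMaster)
      throws Exception {
    Cipher unwrap = Cipher.getInstance("AESWrap");
    unwrap.init(Cipher.UNWRAP_MODE, oldMaster);
    Key dek = unwrap.unwrap(wrappedDek, "AES", Cipher.SECRET_KEY);

    Cipher wrap = Cipher.getInstance("AESWrap");
    wrap.init(Cipher.WRAP_MODE, newMaster);
    return wrap.wrap(dek);
  }
}
{code}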


> Hadoop cryptographic file system
> --------------------------------
>
>                 Key: HADOOP-10150
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10150
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: security
>    Affects Versions: 3.0.0
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>              Labels: rhino
>             Fix For: 3.0.0
>
>         Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file 
> system-V2.docx, HADOOP cryptographic file system.pdf, cfs.patch, extended 
> information based on INode feature.patch
>
>
> There is an increasing need for securing data when Hadoop customers use 
> various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so 
> on.
> HADOOP CFS (Hadoop Cryptographic File System) secures data by using 
> Hadoop’s “FilterFileSystem” to decorate DFS or other file systems, and is 
> transparent to upper layer applications (see the sketch after the 
> requirements list). It is configurable, scalable and fast.
> High level requirements:
> 1.    Transparent to upper layer applications; no modification is required 
> for them.
> 2.    “Seek”, “PositionedReadable” are supported for input stream of CFS if 
> the wrapped file system supports them.
> 3.    Very high performance for encryption and decryption; they will not 
> become a bottleneck.
> 4.    Can decorate HDFS and all other file systems in Hadoop, without 
> modifying the existing file system structure (e.g. the namenode and 
> datanode structure when the wrapped file system is HDFS).
> 5.    Admin can configure encryption policies, such as which directory will 
> be encrypted.
> 6.    A robust key management framework.
> 7.    Support Pread and append operations if the wrapped file system supports 
> them.
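
A minimal sketch of the decoration pattern the description refers to 
(illustrative only, not taken from the attached patch; the commented-out 
CryptoInputStream is a hypothetical placeholder):

{code}
import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FilterFileSystem;
import org.apache.hadoop.fs.Path;

/** Illustrative FilterFileSystem decoration; a real CFS would wrap the
 *  raw stream in a CTR-decrypting stream that still implements
 *  Seekable and PositionedReadable. */
public class CfsSketch extends FilterFileSystem {
  public CfsSketch(FileSystem wrapped) {
    super(wrapped);
  }

  @Override
  public FSDataInputStream open(Path f, int bufferSize) throws IOException {
    // Open the underlying (encrypted) stream from the wrapped file system
    FSDataInputStream raw = fs.open(f, bufferSize);
    // A real implementation would return something like:
    //   new FSDataInputStream(new CryptoInputStream(raw, dataKey))
    return raw; // pass-through placeholder
  }
}
{code}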



--
This message was sent by Atlassian JIRA
(v6.2#6252)
