[ 
https://issues.apache.org/jira/browse/HDFS-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821728#comment-13821728
 ] 

Owen O'Malley commented on HDFS-5143:
-------------------------------------

[~hitliuyi] In the design document, the IV was always 0, but in the comments 
you are suggesting putting a random IV at the start of the underlying file. I 
think that the security advantage of having a random IV is relatively small and 
we'd do better without it. It only protects against revealing that multiple 
files encrypted with the same key contain the same plain text and are 
co-located in the file system.
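As a sketch (not HDFS code), the limitation being weighed here is easy to see with the JCE directly: under AES/CTR with a fixed all-zero IV, the same key and the same plain text always produce the same ciphertext, so an observer can tell two such files are identical. The class and key below are illustrative only.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.util.Arrays;

public class ZeroIvDemo {
    // Encrypt with AES/CTR using an IV fixed to all zeros.
    static byte[] encrypt(byte[] key, byte[] plaintext) throws Exception {
        Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
               new IvParameterSpec(new byte[16])); // IV = 0
        return c.doFinal(plaintext);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16]; // demo key only, never use a fixed key
        byte[] data = "same plain text".getBytes("UTF-8");
        byte[] c1 = encrypt(key, data);
        byte[] c2 = encrypt(key, data);
        // Identical ciphertexts: equality of plaintexts leaks.
        System.out.println(Arrays.equals(c1, c2));
    }
}
```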

I think that putting it at the front of the file has a couple of disadvantages:
* Any read of the file has to read the first 16 bytes of the file to obtain 
the IV.
* Block boundaries are offset from where applications expect them. This will 
cause MapReduce input splits to straddle blocks in cases that wouldn't 
otherwise require it.

I think we should always have an IV of 0, or alternatively encode it in the 
underlying filesystem's filenames. In particular, we could base64-encode the 
IV and append it onto the filename. If we add 16 characters of base64, that 
would give us 96 bits of IV and it would be easy to strip off. It would look 
like:

cfs://hdfs@nn/dir1/dir2/file -> hdfs://nn/dir1/dir2/file_1234567890ABCDEF
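The filename scheme above can be sketched as follows. This is an illustration, not the committed implementation: a 12-byte (96-bit) IV encodes to exactly 16 base64 characters with no padding, so it can be appended after an underscore and stripped off by length. The class and method names are assumptions for the example.

```java
import java.security.SecureRandom;
import java.util.Base64;

public class IvInFilename {
    static final int IV_CHARS = 16; // 12 bytes -> 16 base64 chars, no padding

    // Append the base64-encoded IV to the underlying filename.
    static String addIv(String underlyingName, byte[] iv12) {
        String suffix =
            Base64.getUrlEncoder().withoutPadding().encodeToString(iv12);
        return underlyingName + "_" + suffix;
    }

    // Recover the original filename by stripping the "_" + 16-char suffix.
    static String stripIv(String nameWithIv) {
        return nameWithIv.substring(0, nameWithIv.length() - IV_CHARS - 1);
    }

    // Decode the IV back out of the filename suffix.
    static byte[] extractIv(String nameWithIv) {
        String suffix = nameWithIv.substring(nameWithIv.length() - IV_CHARS);
        return Base64.getUrlDecoder().decode(suffix);
    }

    public static void main(String[] args) {
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        String name = addIv("hdfs://nn/dir1/dir2/file", iv);
        System.out.println(name);
        System.out.println(stripIv(name)); // hdfs://nn/dir1/dir2/file
    }
}
```

Using the URL-safe base64 alphabet avoids `/` and `+`, which would be problematic characters in a path component.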

> Hadoop cryptographic file system
> --------------------------------
>
>                 Key: HDFS-5143
>                 URL: https://issues.apache.org/jira/browse/HDFS-5143
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: security
>    Affects Versions: 3.0.0
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>              Labels: rhino
>             Fix For: 3.0.0
>
>         Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file 
> system.pdf
>
>
> There is an increasing need for securing data when Hadoop customers use 
> various upper layer applications, such as Map-Reduce, Hive, Pig, HBase, and 
> so on.
> HADOOP CFS (HADOOP Cryptographic File System) secures data by using the 
> HADOOP “FilterFileSystem” to decorate DFS or other file systems, and is 
> transparent to upper layer applications. It’s configurable, scalable and fast.
> High level requirements:
> 1.    Transparent to, and requiring no modification of, upper layer 
> applications.
> 2.    “Seek” and “PositionedReadable” are supported for the CFS input stream 
> if the wrapped file system supports them.
> 3.    Very high performance for encryption and decryption, so that they will 
> not become a bottleneck.
> 4.    Can decorate HDFS and all other file systems in Hadoop without 
> modifying the existing structure of the file system, such as the namenode 
> and datanode structure when the wrapped file system is HDFS.
> 5.    Admins can configure encryption policies, such as which directories 
> will be encrypted.
> 6.    A robust key management framework.
> 7.    Pread and append operations are supported if the wrapped file system 
> supports them.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
