[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970602#comment-13970602 ]
Steve Loughran commented on HADOOP-10150:
-----------------------------------------

I like the document on attack vectors, including the coverage of hardware and networking. If we're going down to that level, there's one more layer to consider: virtualized Hadoop clusters.

# even with swapping disabled in the guest, memory could be swapped out by the host OS
# pagefile secrets could be preserved after VM destruction
# disks may not be wiped

Fixes
# Don't give a transient cluster access to keys needed to decrypt persistent data, other than those needed by specific jobs
# explore with your virtualization/cloud service provider what their VM and virtual disk security policies are: when do the virtual disks get wiped, and how rigorously.

Other things to worry about
# malicious DNs joining the cluster. Again, it's hard to block this in a cloud, as hostnames aren't known in advance (so you can't have them on the included-hosts list). Fix: use a VPN rather than any datacentre-wide network.
# fundamental security holes in core dependency libraries (OS & JVM layer). Keep your machines up to date, and have mechanisms for renewing and revoking certificates, ...

> Hadoop cryptographic file system
> --------------------------------
>
>                 Key: HADOOP-10150
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10150
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: security
>    Affects Versions: 3.0.0
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>              Labels: rhino
>             Fix For: 3.0.0
>
>         Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file system-V2.docx, HADOOP cryptographic file system.pdf, HDFSDataAtRestEncryptionAlternatives.pdf, HDFSDataatRestEncryptionAttackVectors.pdf, HDFSDataatRestEncryptionProposal.pdf, cfs.patch, extended information based on INode feature.patch
>
>
> There is an increasing need for securing data when Hadoop customers use various upper-layer applications, such as Map-Reduce, Hive, Pig, HBase and so on.
> HADOOP CFS (HADOOP Cryptographic File System) is used to secure data. It is based on the HADOOP “FilterFileSystem” decorating DFS or other file systems, and is transparent to upper-layer applications. It’s configurable, scalable and fast.
> High level requirements:
> 1. Transparent to, and requiring no modification of, upper-layer applications.
> 2. “Seek” and “PositionedReadable” are supported for the CFS input stream if the wrapped file system supports them.
> 3. Very high performance for encryption and decryption, so that they do not become a bottleneck.
> 4. Can decorate HDFS and all other file systems in Hadoop, and will not modify the existing structure of the file system, such as the namenode and datanode structure if the wrapped file system is HDFS.
> 5. Admins can configure encryption policies, such as which directories will be encrypted.
> 6. A robust key management framework.
> 7. Support for pread and append operations if the wrapped file system supports them.
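To picture the decorator approach described in the proposal, here is a minimal sketch of a FilterFileSystem subclass that wraps the input streams of an underlying file system. The class names, the "sketch.cfs.key" configuration property and the key handling are hypothetical, and the per-byte XOR "keystream" is only a stand-in for a real AES-CTR cipher; the point is to show how a decorating file system can stay transparent to callers while keeping Seek and PositionedReadable working, because a CTR-style keystream can be recomputed for any byte offset.

{code:java}
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FilterFileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Sketch only: decorates another FileSystem and decrypts on read. A real CFS
 * would also override create()/append() with an encrypting output stream and
 * obtain keys from a key management framework rather than the configuration.
 */
public class SketchCryptoFileSystem extends FilterFileSystem {

  private byte[] key;  // hypothetical: a real CFS would fetch this from a key provider

  public SketchCryptoFileSystem(FileSystem wrapped) {
    super(wrapped);
  }

  @Override
  public void initialize(URI name, Configuration conf) throws IOException {
    super.initialize(name, conf);
    // "sketch.cfs.key" is a made-up configuration property for this example.
    key = conf.get("sketch.cfs.key", "0123456789abcdef").getBytes(StandardCharsets.UTF_8);
  }

  @Override
  public FSDataInputStream open(Path f, int bufferSize) throws IOException {
    // Decorate the wrapped file system's stream; callers see a normal FSDataInputStream.
    return new FSDataInputStream(new DecryptingInputStream(fs.open(f, bufferSize), key));
  }

  /**
   * FSInputStream already implements Seekable and PositionedReadable, so the
   * FSDataInputStream wrapper above accepts it. Because the toy keystream is a
   * pure function of (key, byte offset), decryption after seek() or a
   * positioned read needs no extra state -- the property AES-CTR would give.
   */
  private static final class DecryptingInputStream extends FSInputStream {
    private final FSDataInputStream in;
    private final byte[] key;

    DecryptingInputStream(FSDataInputStream in, byte[] key) {
      this.in = in;
      this.key = key;
    }

    /** Toy stand-in for an AES-CTR keystream byte at a given file offset. */
    private byte keystream(long pos) {
      return (byte) (key[(int) (pos % key.length)] ^ (pos * 31));
    }

    @Override
    public synchronized int read() throws IOException {
      long pos = in.getPos();
      int b = in.read();
      return b < 0 ? b : (b ^ keystream(pos)) & 0xff;
    }

    @Override
    public synchronized int read(byte[] buf, int off, int len) throws IOException {
      long pos = in.getPos();
      int n = in.read(buf, off, len);
      for (int i = 0; i < n; i++) {
        buf[off + i] ^= keystream(pos + i);
      }
      return n;
    }

    @Override
    public int read(long position, byte[] buf, int off, int len) throws IOException {
      // PositionedReadable: decrypt relative to the explicit position, not getPos().
      int n = in.read(position, buf, off, len);
      for (int i = 0; i < n; i++) {
        buf[off + i] ^= keystream(position + i);
      }
      return n;
    }

    @Override
    public synchronized void seek(long pos) throws IOException { in.seek(pos); }

    @Override
    public synchronized long getPos() throws IOException { return in.getPos(); }

    @Override
    public boolean seekToNewSource(long targetPos) throws IOException { return in.seekToNewSource(targetPos); }

    @Override
    public void close() throws IOException { in.close(); }
  }
}
{code}

Because the keystream depends only on the key and the byte offset, a seek or pread needs nothing beyond the target position, which is why requirements 2 and 7 can be met without touching namenode or datanode structures; the wrapped file system remains the sole owner of the on-disk layout.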