[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156519#comment-14156519 ]

Hudson commented on HADOOP-10150:
FAILURE: Integrated in Hadoop-Mapreduce-trunk #1914 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1914/])
Fix up CHANGES.txt for HDFS-6134, HADOOP-10150 and related JIRAs following merge to branch-2 (arp: rev 2ca93d1fbf0fdcd6b4b5a151261052ac106ac9e1)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-mapreduce-project/CHANGES.txt

> Hadoop cryptographic file system
>
> Key: HADOOP-10150
> URL: https://issues.apache.org/jira/browse/HADOOP-10150
> Project: Hadoop Common
> Issue Type: New Feature
> Components: security
> Affects Versions: 3.0.0
> Reporter: Yi Liu
> Assignee: Yi Liu
> Labels: rhino
> Fix For: 2.6.0
> Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file system-V2.docx, HADOOP cryptographic file system.pdf, HDFSDataAtRestEncryptionAlternatives.pdf, HDFSDataatRestEncryptionAttackVectors.pdf, HDFSDataatRestEncryptionProposal.pdf, cfs.patch, extended information based on INode feature.patch
>
> There is an increasing need to secure data when Hadoop customers use upper-layer applications such as MapReduce, Hive, Pig, and HBase.
> HADOOP CFS (Hadoop Cryptographic File System) secures data by using Hadoop's "FilterFileSystem" to decorate DFS or other file systems, transparently to upper-layer applications. It is configurable, scalable, and fast.
> High-level requirements:
> 1. Transparent to, and requiring no modification of, upper-layer applications.
> 2. "Seek" and "PositionedReadable" are supported on the CFS input stream if the wrapped file system supports them.
> 3. Very high encryption and decryption performance, so that they do not become a bottleneck.
> 4. Can decorate HDFS and all other Hadoop file systems without modifying the existing file system structure (for example, the NameNode and DataNode structures when the wrapped file system is HDFS).
> 5. Admins can configure encryption policies, such as which directories are encrypted.
> 6. A robust key management framework.
> 7. Support for pread and append operations if the wrapped file system supports them.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
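Requirement 2 above (seek and pread on the input stream) is what makes a counter-mode cipher attractive: AES-CTR turns the IV into a 128-bit counter, so a reader can decrypt from any byte offset by advancing the counter by offset/16 blocks and discarding offset%16 leading bytes. The following is a minimal, self-contained sketch of that idea using the standard JCE with an all-zero demo key and IV; it is illustrative only, not the code from the attached patch (the patch's AesCtrCryptoCodec/CryptoInputStream implement the same principle).

```java
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class CtrSeekSketch {
    static final int AES_BLOCK = 16;

    // Compute the counter value for the AES block containing byteOffset.
    static byte[] counterFor(byte[] initialIv, long byteOffset) {
        byte[] ctr = Arrays.copyOf(initialIv, AES_BLOCK);
        long add = byteOffset / AES_BLOCK;
        // Add 'add' to the big-endian 128-bit counter, propagating carries.
        for (int i = AES_BLOCK - 1; i >= 0 && add != 0; i--) {
            long sum = (ctr[i] & 0xFFL) + (add & 0xFF);
            ctr[i] = (byte) sum;
            add = (add >>> 8) + (sum >>> 8);
        }
        return ctr;
    }

    // Decrypt ciphertext[from..] as if we had seeked to 'from' in the stream.
    static byte[] decryptFrom(byte[] key, byte[] iv, byte[] ciphertext, int from)
            throws Exception {
        Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
        c.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
               new IvParameterSpec(counterFor(iv, from)));
        // Pad with dummy bytes up to the block boundary so the keystream
        // lines up, then strip the padding from the decrypted output.
        int pad = from % AES_BLOCK;
        byte[] padded = new byte[pad + ciphertext.length - from];
        System.arraycopy(ciphertext, from, padded, pad, ciphertext.length - from);
        return Arrays.copyOfRange(c.doFinal(padded), pad, padded.length);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16], iv = new byte[16]; // demo key/IV only
        byte[] plain = "0123456789abcdefghijklmnopqrstuv".getBytes();
        Cipher enc = Cipher.getInstance("AES/CTR/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                 new IvParameterSpec(iv));
        byte[] ct = enc.doFinal(plain);
        // Seek to offset 21 and decrypt the tail without touching the prefix.
        System.out.println(new String(decryptFrom(key, iv, ct, 21))); // prints "lmnopqrstuv"
    }
}
```

Because decryption at an offset needs only the file IV and the offset itself, no change to the stored file layout is required, which is also why requirement 4 (no NameNode/DataNode structure changes) is satisfiable.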
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156403#comment-14156403 ]

Hudson commented on HADOOP-10150:
FAILURE: Integrated in Hadoop-Hdfs-trunk #1889 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1889/])
Fix up CHANGES.txt for HDFS-6134, HADOOP-10150 and related JIRAs following merge to branch-2 (arp: rev 2ca93d1fbf0fdcd6b4b5a151261052ac106ac9e1)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-mapreduce-project/CHANGES.txt
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156325#comment-14156325 ]

Hudson commented on HADOOP-10150:
FAILURE: Integrated in Hadoop-Yarn-trunk #698 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/698/])
Fix up CHANGES.txt for HDFS-6134, HADOOP-10150 and related JIRAs following merge to branch-2 (arp: rev 2ca93d1fbf0fdcd6b4b5a151261052ac106ac9e1)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-mapreduce-project/CHANGES.txt
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14155023#comment-14155023 ]

Hudson commented on HADOOP-10150:
FAILURE: Integrated in Hadoop-trunk-Commit #6163 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6163/])
Fix up CHANGES.txt for HDFS-6134, HADOOP-10150 and related JIRAs following merge to branch-2 (arp: rev 2ca93d1fbf0fdcd6b4b5a151261052ac106ac9e1)
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14120262#comment-14120262 ]

Terence Spies commented on HADOOP-10150:
After looking at the encryption proposal, I'm very concerned about the security of the chosen mechanism. As I understand it, the idea is to use AES in counter mode to encrypt the data (which offers no integrity protection) and to rely on the existing CRC32 checksums to detect data tampering. The problem here is that the CRC32 checksums are unkeyed, and are quite easy for an active attacker to defeat. The net result is that an attacker, even though they cannot learn the value of any individual bit, can trivially flip the value of any bit they desire in the file. Such a flip may be caught by the CRC32 checksum, but it is not difficult to defeat this checksum mechanism by making compensating trial bit flips. At a bare minimum, the checksum mechanism should be replaced with a keyed HMAC- or CMAC-based checksum. (The one thing I could not determine is whether the checksum file is encrypted; if it isn't, the plaintext checksums leak information about the underlying plaintext. HMACs/CMACs would prevent that as well.) In general, not using an authenticated encryption mode is quite dangerous here. Separate MACs do enable detection of tampering, but the mechanism needs to be carefully implemented so that attackers cannot use error or timing information to insert tampered data into files. I understand the desire to keep the file seekable and not to change the size of the underlying file. My suggestion would be to stay with a mode that encrypts and decrypts as a block cipher, but to keep the block size small enough that seeking needs only a smallish buffer. In terms of file size, most block modes support ciphertext stealing, which enables the ciphertext to be trimmed to whatever byte size is required.
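The malleability concern and the proposed keyed-checksum fix can be demonstrated in a few lines. The sketch below uses the standard JCE and demo all-zero keys; it is not Hadoop's actual checksum path. It shows that flipping one ciphertext bit under CTR flips exactly that plaintext bit, that an unkeyed CRC32 over the data can simply be recomputed by the attacker, and that an HMAC keyed with a secret the attacker lacks does catch the change.

```java
import java.security.MessageDigest;
import java.util.zip.CRC32;
import javax.crypto.Cipher;
import javax.crypto.Mac;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class KeyedChecksumSketch {
    public static void main(String[] args) throws Exception {
        byte[] encKey = new byte[16], macKey = new byte[32], iv = new byte[16];
        Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(encKey, "AES"),
               new IvParameterSpec(iv));
        byte[] ct = c.doFinal("transfer $100".getBytes());

        // CTR is malleable: XOR-ing a ciphertext bit flips the same plaintext bit.
        byte[] tampered = ct.clone();
        tampered[10] ^= 0x01; // attacker flips one bit blindly

        // An unkeyed CRC32 offers no protection: the attacker can recompute
        // it over the tampered bytes and ship the new checksum alongside.
        CRC32 crc = new CRC32();
        crc.update(tampered);
        System.out.println("attacker-recomputed CRC32: " + crc.getValue());

        // A keyed HMAC cannot be recomputed without macKey, so the flip is caught.
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(macKey, "HmacSHA256"));
        byte[] tagOriginal = mac.doFinal(ct);
        byte[] tagTampered = mac.doFinal(tampered);
        boolean detected = !MessageDigest.isEqual(tagOriginal, tagTampered);
        System.out.println("tampering detected via HMAC: " + detected); // true
    }
}
```

Note the HMAC tag also reveals nothing about the plaintext without the MAC key, which addresses the checksum-leakage point above.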
One suggestion would be to look at a mode like OCB, which provides authenticated encryption with very little overhead and also supports associated data. The associated-data feature would allow the block position and IV to be incorporated, giving seekability. For example, if the file were encrypted with a 128-byte block size, the associated data (similar to the CTR-mode block index) would be the position of the block within the file plus the per-file IV. This would also have the upside of producing an authentication tag for each block, which could at some point be stored in metadata to provide cryptographic integrity. Note that we would also want to turn off the CRC32 checksums, as they would leak data about the underlying plaintext.
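OCB is not available in the standard JCE, so the sketch below uses AES-GCM as a stand-in AEAD to illustrate the suggestion: bind each block's index and the per-file IV into the associated data, so a block authenticates only at its original position in its original file. All names here (sealBlock, openBlock) and the nonce scheme are hypothetical demo choices, not a proposed Hadoop API; a real design must guarantee nonce uniqueness per key.

```java
import java.nio.ByteBuffer;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class AeadBlockSketch {
    // Associated data = file IV || block index, so blocks cannot be
    // reordered within a file or transplanted between files.
    static byte[] aad(byte[] fileIv, long blockIndex) {
        return ByteBuffer.allocate(fileIv.length + 8)
                .put(fileIv).putLong(blockIndex).array();
    }

    static byte[] sealBlock(byte[] key, byte[] fileIv, long blockIndex, byte[] plain)
            throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        // Per-block nonce derived from the block index (demo only).
        byte[] nonce = ByteBuffer.allocate(12).putLong(4, blockIndex).array();
        c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
               new GCMParameterSpec(128, nonce));
        c.updateAAD(aad(fileIv, blockIndex));
        return c.doFinal(plain); // ciphertext || 16-byte auth tag
    }

    static byte[] openBlock(byte[] key, byte[] fileIv, long blockIndex, byte[] sealed)
            throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        byte[] nonce = ByteBuffer.allocate(12).putLong(4, blockIndex).array();
        c.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
               new GCMParameterSpec(128, nonce));
        c.updateAAD(aad(fileIv, blockIndex));
        return c.doFinal(sealed); // throws AEADBadTagException on any tampering
    }

    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16], fileIv = new byte[16]; // demo values only
        byte[] sealed = sealBlock(key, fileIv, 7, "block seven".getBytes());
        System.out.println(new String(openBlock(key, fileIv, 7, sealed)));
        // openBlock with a wrong index (e.g. 8) fails authentication.
    }
}
```

Seeking then means decrypting only the block containing the target offset; the per-block tags are the extra metadata the comment mentions, which is the size trade-off relative to plain CTR.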
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115426#comment-14115426 ]

Hudson commented on HADOOP-10150:
FAILURE: Integrated in Hadoop-Mapreduce-trunk #1880 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1880/])
Fix up CHANGES.txt for HDFS-6134, HADOOP-10150 and related JIRAs following merge to branch-2 (tucu: rev d9a7404c389ea1adffe9c13f7178b54678577b56)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-common-project/hadoop-common/CHANGES.txt
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115254#comment-14115254 ]

Hudson commented on HADOOP-10150:
FAILURE: Integrated in Hadoop-Hdfs-trunk #1854 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1854/])
Fix up CHANGES.txt for HDFS-6134, HADOOP-10150 and related JIRAs following merge to branch-2 (tucu: rev d9a7404c389ea1adffe9c13f7178b54678577b56)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-mapreduce-project/CHANGES.txt
* hadoop-common-project/hadoop-common/CHANGES.txt
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14115141#comment-14115141 ]

Hudson commented on HADOOP-10150:
FAILURE: Integrated in Hadoop-Yarn-trunk #663 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/663/])
Fix up CHANGES.txt for HDFS-6134, HADOOP-10150 and related JIRAs following merge to branch-2 (tucu: rev d9a7404c389ea1adffe9c13f7178b54678577b56)
* hadoop-common-project/hadoop-common/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-mapreduce-project/CHANGES.txt
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105846#comment-14105846 ]

Hudson commented on HADOOP-10150:
SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1870 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1870/])
Fix up CHANGES.txt for HDFS-6134, HADOOP-10150 and related JIRAs. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619203)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES-fs-encryption.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES-fs-encryption.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES-fs-encryption.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
HDFS-6134 and HADOOP-10150 subtasks. Merge fs-encryption branch to trunk. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619197)
* /hadoop/common/trunk
* /hadoop/common/trunk/BUILDING.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES-fs-encryption.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/pom.xml
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/CMakeLists.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/config.h.cmake
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/docs
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/AesCtrCryptoCodec.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/CipherSuite.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/CryptoCodec.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/CryptoInputStream.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/CryptoOutputStream.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/CryptoStreamUtils.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/Decryptor.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/Encryptor.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/JceAesCtrCryptoCodec.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/OpensslAesCtrCryptoCodec.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/OpensslCipher.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/random
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeys.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CommonConfigurationKeysPublic.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSDataOutputStream.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileEncryptionInfo.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/crypto
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommands.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/NativeCodeLoader.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/NativeLibraryChecker.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/crypto
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/native/src/org/apache/hadoop/util/NativeCodeLoader.c
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/site/apt/FileSystemShell.apt.vm
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/core
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/CryptoStreamsTestBase.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/crypto/TestCryptoCodec.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-commo
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105778#comment-14105778 ]

Hudson commented on HADOOP-10150:
FAILURE: Integrated in Hadoop-Hdfs-trunk #1844 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1844/])
Fix up CHANGES.txt for HDFS-6134, HADOOP-10150 and related JIRAs. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619203)
HDFS-6134 and HADOOP-10150 subtasks. Merge fs-encryption branch to trunk. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619197)
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105556#comment-14105556 ] Hudson commented on HADOOP-10150: - FAILURE: Integrated in Hadoop-Yarn-trunk #653 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/653/])
Fix up CHANGES.txt for HDFS-6134, HADOOP-10150 and related JIRAs. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619203) Touches the CHANGES.txt and CHANGES-fs-encryption.txt files under hadoop-common, hadoop-hdfs and hadoop-mapreduce.
HDFS-6134 and HADOOP-10150 subtasks. Merge fs-encryption branch to trunk. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619197) Touches BUILDING.txt, the hadoop-common build files, the new org.apache.hadoop.crypto package (AES-CTR codecs, CryptoInputStream/CryptoOutputStream, OpenSSL cipher binding, secure random), FileEncryptionInfo, the common configuration keys, the fs shell copy commands, native code, core-default.xml, the FileSystemShell docs, and the crypto test suites.
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14104857#comment-14104857 ] Yi Liu commented on HADOOP-10150: - Thanks [~andrew.wang].

> Hadoop cryptographic file system
>
> Key: HADOOP-10150
> URL: https://issues.apache.org/jira/browse/HADOOP-10150
> Project: Hadoop Common
> Issue Type: New Feature
> Components: security
> Affects Versions: 3.0.0
> Reporter: Yi Liu
> Assignee: Yi Liu
> Labels: rhino
> Fix For: 3.0.0
>
> Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file system-V2.docx, HADOOP cryptographic file system.pdf, HDFSDataAtRestEncryptionAlternatives.pdf, HDFSDataatRestEncryptionAttackVectors.pdf, HDFSDataatRestEncryptionProposal.pdf, cfs.patch, extended information based on INode feature.patch
>
> There is an increasing need to secure data when Hadoop customers use upper-layer applications such as MapReduce, Hive, Pig and HBase.
> HADOOP CFS (HADOOP Cryptographic File System) secures data by decorating DFS or other file systems with Hadoop's "FilterFileSystem", transparently to upper-layer applications. It is configurable, scalable and fast.
> High-level requirements:
> 1. Transparent to, and requiring no modification of, upper-layer applications.
> 2. "Seek" and "PositionedReadable" are supported on CFS input streams if the wrapped file system supports them.
> 3. Very high encryption and decryption performance, so that they do not become a bottleneck.
> 4. Can decorate HDFS and all other Hadoop file systems without modifying the existing file system structure (e.g. the namenode and datanode structure when the wrapped file system is HDFS).
> 5. Admins can configure encryption policies, such as which directories are encrypted.
> 6. A robust key management framework.
> 7. Support for pread and append if the wrapped file system supports them.
-- This message was sent by Atlassian JIRA (v6.2#6252)
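Requirement 2 above (seek/pread on an encrypted stream) is what drives the choice of AES-CTR throughout this thread: in CTR mode the keystream for any byte offset can be computed directly from the IV, so decryption can start mid-file without reading what came before. A minimal JDK-only sketch of that property; this is not Hadoop's actual CryptoInputStream, and the class and helper names are illustrative:

```java
import java.math.BigInteger;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class CtrSeekDemo {
    static final int BLOCK = 16; // AES block size in bytes

    // Counter block for a byte offset: the initial IV plus (offset / 16),
    // computed as 128-bit big-endian addition (wrapping on overflow).
    static byte[] ivForOffset(byte[] iv, long offset) {
        BigInteger ctr = new BigInteger(1, iv).add(BigInteger.valueOf(offset / BLOCK));
        byte[] raw = ctr.toByteArray();
        byte[] out = new byte[BLOCK];
        int n = Math.min(raw.length, BLOCK);
        System.arraycopy(raw, raw.length - n, out, BLOCK - n, n); // right-align, drop overflow
        return out;
    }

    static byte[] crypt(int mode, byte[] key, byte[] iv, byte[] data) throws Exception {
        Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
        c.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return c.doFinal(data);
    }

    static byte[] encrypt(byte[] key, byte[] iv, byte[] data) throws Exception {
        return crypt(Cipher.ENCRYPT_MODE, key, iv, data);
    }

    // Decrypt ciphertext from an arbitrary byte offset without touching the
    // bytes before it -- the property that lets a CTR-based stream seek.
    // Pads the chunk out to the enclosing block boundary, decrypts with the
    // advanced counter, then discards the partial-block lead-in.
    static byte[] decryptAt(byte[] key, byte[] iv, byte[] ciphertext, long offset) throws Exception {
        int pad = (int) (offset % BLOCK);
        byte[] chunk = new byte[pad + ciphertext.length - (int) offset];
        System.arraycopy(ciphertext, (int) offset, chunk, pad, ciphertext.length - (int) offset);
        byte[] plain = crypt(Cipher.DECRYPT_MODE, key, ivForOffset(iv, offset), chunk);
        return Arrays.copyOfRange(plain, pad, plain.length);
    }
}
```

A seek in a real crypto stream is essentially `ivForOffset` plus a `Cipher.init` at the new position; the JDK CipherInputStream offers no such repositioning, which is one reason the thread keeps the streams custom.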
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14104631#comment-14104631 ] Hudson commented on HADOOP-10150: - FAILURE: Integrated in Hadoop-trunk-Commit #6090 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6090/]) Fix up CHANGES.txt for HDFS-6134, HADOOP-10150 and related JIRAs. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619203) Same changed-file list as the rev 1619203 entry under Hadoop-Yarn-trunk #653 above.
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14104329#comment-14104329 ] Hudson commented on HADOOP-10150: - FAILURE: Integrated in Hadoop-trunk-Commit #6089 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6089/]) HDFS-6134 and HADOOP-10150 subtasks. Merge fs-encryption branch to trunk. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1619197) Same changed-file list as the rev 1619197 entry under Hadoop-Yarn-trunk #653 above.
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998431#comment-13998431 ] Yi Liu commented on HADOOP-10150: - Thanks [~tucu00] for creating these sub-tasks, let's use them.
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996861#comment-13996861 ] Alejandro Abdelnur commented on HADOOP-10150: - [~hitliuyi], I've created sub-tasks 7, 8 & 9. They somewhat duplicate existing ones; would you please do a clean-up pass and keep the ones that make sense under the current proposal?
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995305#comment-13995305 ] Alejandro Abdelnur commented on HADOOP-10150: - [cross-posting with HDFS-6134] Reopening HDFS-6134. After some offline discussions with Yi, Tianyou, ATM, Todd, Andrew and Charles, we think it makes more sense to implement encryption for HDFS directly in the DistributedFileSystem client, and to use CryptoFileSystem to support encryption for FileSystems that don't support native encryption. The reasons for this change of course are:
* If we later add support for transparent HDFS compression, the compression must happen before the encryption (ciphertext has near-maximal entropy and compresses poorly). If compression is handled by the HDFS DistributedFileSystem, the encryption then has to happen afterwards in the write path.
* The proposed CryptoSupport abstraction significantly complicates the implementation of CryptoFileSystem and its wiring into the HDFS FileSystem client.
* Building it directly into the HDFS FileSystem client may let us avoid an extra copy of the data.
Because of this, the idea is now:
* A common set of crypto input/output streams. They would be used by CryptoFileSystem, HDFS encryption, and MapReduce intermediate data and spills. Note we cannot use the JDK Cipher input/output streams directly because we need to support the additional interfaces that the Hadoop FileSystem streams implement (Seekable, PositionedReadable, ByteBufferReadable, HasFileDescriptor, CanSetDropBehind, CanSetReadahead, HasEnhancedByteBufferAccess, Syncable).
* CryptoFileSystem, to support encryption in arbitrary FileSystems.
* HDFS client encryption, to support transparent HDFS encryption.
Both the CryptoFileSystem and HDFS client encryption implementations would be built on the crypto input/output streams, xattrs and the KeyProvider API.
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987655#comment-13987655 ] Yi Liu commented on HADOOP-10150: - Steve, thank you for the comments. About blobstores: I remember you brought this up before; it is a good idea and can inspire us. A few shortcomings are that 1) it is a third-party, standalone service, which adds deployment and management overhead; 2) there are authentication/authorization issues, for example around integration and management; and 3) it relies on the maturity of the blobstore. I'm not saying it is a bad option, just that, compared to xattrs, the latter has more merits.
{quote} generates a per-file key, encrypts it in the public key of users and admins. {quote}
Agree, having two layers of keys is necessary; for example, it makes key rotation convenient.
{quote} It'd be good if the mechanism to store/retrieve keys worked with all the filesystems -even if they didn't have full xattr support. Maybe this could be done if the design supported a very simple adapter for each FS, which only handled the read/write of crypto keys {quote}
Agree, supporting all the filesystems is one target, and decoupling is a basic design principle. xattrs are widely supported across OSes and filesystems. If some underlying file system lacks full xattr support, we can fall back in different ways behind the same xattr interfaces; one fallback implementation could use a blobstore. This guarantees that CFS works efficiently, and is easy to manage, on top of most filesystems that support xattrs.
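The "two layer keys" point above (a per-file key wrapped by a user/admin key) is standard envelope encryption: rotating the outer key re-wraps only a small key blob instead of re-encrypting the file data. A JDK-only sketch of the idea using RFC 3394 AES key wrap ("AESWrap" in the JDK); the class and method names are illustrative and this is not the Hadoop KeyProvider API:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class EnvelopeDemo {
    // Wrap a per-file data key under a master key (AES key wrap, RFC 3394).
    static byte[] wrap(SecretKey master, SecretKey fileKey) throws Exception {
        Cipher c = Cipher.getInstance("AESWrap");
        c.init(Cipher.WRAP_MODE, master);
        return c.wrap(fileKey);
    }

    static SecretKey unwrap(SecretKey master, byte[] blob) throws Exception {
        Cipher c = Cipher.getInstance("AESWrap");
        c.init(Cipher.UNWRAP_MODE, master);
        return (SecretKey) c.unwrap(blob, "AES", Cipher.SECRET_KEY);
    }

    // Key rotation re-wraps only the small blob; the bulk file data,
    // encrypted under the unchanged file key, is untouched.
    static byte[] rotate(SecretKey oldMaster, SecretKey newMaster, byte[] blob) throws Exception {
        return wrap(newMaster, unwrap(oldMaster, blob));
    }

    static SecretKey newAesKey() throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        return kg.generateKey();
    }
}
```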
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987627#comment-13987627 ] Yi Liu commented on HADOOP-10150: - Owen, thanks a lot for the comments and ideas. Thanks Andrew for the explanation too.
{quote} We have two metadata items that we need for each file: the key name and version, the iv. Note that the current patches only store the iv, but we really need to store the key name and version. The version is absolutely critical because if you roll a new key version you don't want to re-write all of the current data. {quote}
Right, I agree. It's also included in the latest doc posted by [~tucu00].
{quote} It seems to me there are three reasonable places to store the small amount of metadata: at the beginning of the file, in a side file, encoded using a filename mangling scheme {quote}
- At the beginning of the file: as you said, this has some weaknesses. We used this approach in the earliest patch, and you commented then that it was not good enough.
- A side file: doubles the amount of traffic and storage.
- Encoding in a mangled filename: you suggested this in a previous comment, and I thought about it carefully and tried it, but found a few issues. The transformation only works one way: we can create a crypto file and encrypt easily, since we know the IV/key and can encode them into the file name. Decryption is the problem: when an upper-layer application reads a transparently encrypted file, the crypto file system cannot easily map the requested name to the encoded file name, since it does not know the IV/key. One possible way is to iterate over the directory to find candidate file names, but that is inefficient and the mapping is not accurate enough. Furthermore, the mangled file name is longer than, and different from, the original, which may not work well in some cases.
As Andrew explained, we will use the extended attributes feature of the filesystem (HDFS-2006), which is a common feature in traditional OSes/filesystems and is well suited to storing extended information about a file or directory, especially short attributes like a small amount of crypto metadata.
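The per-file metadata discussed here (key name, key version, IV) is small enough to fit in a single xattr value. A sketch of one possible encoding; the layout below is hypothetical, chosen only for illustration, and is not necessarily the format Hadoop adopted (the merged branch carries this information in FileEncryptionInfo):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CryptoXAttrDemo {
    // Hypothetical single-xattr layout:
    // [short nameLen][name bytes][int keyVersion][short ivLen][iv bytes]
    static byte[] pack(String keyName, int keyVersion, byte[] iv) {
        byte[] name = keyName.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(2 + name.length + 4 + 2 + iv.length);
        buf.putShort((short) name.length).put(name);
        buf.putInt(keyVersion);
        buf.putShort((short) iv.length).put(iv);
        return buf.array();
    }

    // Returns { keyName (String), keyVersion (Integer), iv (byte[]) }.
    static Object[] unpack(byte[] value) {
        ByteBuffer buf = ByteBuffer.wrap(value);
        byte[] name = new byte[buf.getShort()];
        buf.get(name);
        int version = buf.getInt();
        byte[] iv = new byte[buf.getShort()];
        buf.get(iv);
        return new Object[] { new String(name, StandardCharsets.UTF_8), version, iv };
    }
}
```

Storing the key name and version (not the key itself) is what makes key rotation cheap: new files reference the new version while old files keep decrypting with the version recorded in their xattr.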
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987572#comment-13987572 ] Steve Loughran commented on HADOOP-10150: - The blobstores normally support some form of metadata, which could be used for this data, as do things like NTFS and HFS+. Indeed, this is how NTFS encryption works: it generates a per-file key, encrypts it with the public keys of users and admins, and attaches them all as independent metadata entries. It'd be good if the mechanism to store/retrieve keys worked with all the filesystems, even if they didn't have full xattr support. Maybe this could be done if the design supported a very simple adapter for each FS, which only handled the read/write of crypto keys.
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987025#comment-13987025 ] Andrew Wang commented on HADOOP-10150: -- Hey Owen, I think the plan here is to use xattrs to store this additional data. Is that satisfactory? This means it wouldn't be a pure wrapper since it'd require the underlying filesystem to implement xattrs (HDFS-2006 is linked as "requires"). The upside is that the design is nicer, and we can do tighter integration with HDFS. > Hadoop cryptographic file system > > > Key: HADOOP-10150 > URL: https://issues.apache.org/jira/browse/HADOOP-10150 > Project: Hadoop Common > Issue Type: New Feature > Components: security >Affects Versions: 3.0.0 >Reporter: Yi Liu >Assignee: Yi Liu > Labels: rhino > Fix For: 3.0.0 > > Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file > system-V2.docx, HADOOP cryptographic file system.pdf, > HDFSDataAtRestEncryptionAlternatives.pdf, > HDFSDataatRestEncryptionAttackVectors.pdf, > HDFSDataatRestEncryptionProposal.pdf, cfs.patch, extended information based > on INode feature.patch > > > There is an increasing need for securing data when Hadoop customers use > various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so > on. > HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based > on HADOOP “FilterFileSystem” decorating DFS or other file systems, and > transparent to upper layer applications. It’s configurable, scalable and fast. > High level requirements: > 1.Transparent to and no modification required for upper layer > applications. > 2.“Seek”, “PositionedReadable” are supported for input stream of CFS if > the wrapped file system supports them. > 3.Very high performance for encryption and decryption, they will not > become bottleneck. 
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13987015#comment-13987015 ] Owen O'Malley commented on HADOOP-10150: I've been working through this. We have two metadata items that we need for each file:
* the key name and version
* the iv
Note that the current patches only store the iv, but we really need to store the key name and version. The version is absolutely critical because if you roll a new key version you don't want to re-write all of the current data. It seems to me there are three reasonable places to store the small amount of metadata:
* at the beginning of the file
* in a side file
* encoded using a filename mangling scheme
The beginning of the file creates trouble because it throws off the block calculations that are done by mapreduce. (In other words, if we slide all of the data down by 1k, then each input split will always cross HDFS block boundaries.) On the other hand, it doesn't add any load to the namenode and will always be consistent with the file. A side file doesn't change the offsets into the file, but does double the amount of traffic and storage required on the namenode. Doing name mangling means the underlying HDFS file names are more complicated, but it doesn't mess with either the file offsets or increase the load on the namenode. I think we should do the name mangling. What do others think?
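As a rough illustration of the name-mangling option above: the key name, key version and IV could be encoded into the underlying file name and parsed back out. This is only a sketch; the `.__cfs__` marker and the field layout are invented here, not taken from any patch.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical sketch of the filename-mangling option: key name, key version
// and IV ride along in the underlying HDFS file name, so no extra namenode
// objects are created and file offsets are untouched.
public class NameMangling {
    static final String SEP = ".__cfs__";  // made-up marker, illustration only

    static String mangle(String name, String keyName, int keyVersion, byte[] iv) {
        // NOTE: assumes keyName contains no '@'; a real scheme needs escaping.
        String meta = keyName + "@" + keyVersion + "@"
                + Base64.getUrlEncoder().withoutPadding().encodeToString(iv);
        return name + SEP + meta;
    }

    // Returns {plainName, keyName, keyVersion, base64Iv}.
    static String[] demangle(String mangled) {
        int i = mangled.lastIndexOf(SEP);
        String[] meta = mangled.substring(i + SEP.length()).split("@");
        return new String[] { mangled.substring(0, i), meta[0], meta[1], meta[2] };
    }

    public static void main(String[] args) {
        byte[] iv = new byte[16];
        String m = mangle("part-00000", "projectKey", 3, iv);
        System.out.println(m);
        System.out.println(Arrays.toString(demangle(m)));
    }
}
```

The obvious costs Owen mentions still apply: listing a directory must strip the suffix, and renames must preserve it.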
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975873#comment-13975873 ] Andrew Purtell commented on HADOOP-10150: - bq. there's one more layer to consider: virtualized hadoop clusters. An interesting paper on this topic is http://eprint.iacr.org/2014/248.pdf, which discusses side channel attacks on AES on Xen and VMWare platforms. JCE ciphers were not included in the analysis but should be suspect until proven otherwise. JRE >= 8 will accelerate AES using AES-NI instructions. Since AES-NI performs each full round of AES in a hardware register, all known side channel attacks are prevented.
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13971645#comment-13971645 ] Alejandro Abdelnur commented on HADOOP-10150: - Steve, that's good, we should make sure that ends up in the final documentation as well.
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970602#comment-13970602 ] Steve Loughran commented on HADOOP-10150: - I like the document on attack vectors, including the sections on hardware and networking. If we're going down to that level, there's one more layer to consider: virtualized hadoop clusters.
# even if the guest doesn't swap, memory could be swapped out by the host OS
# pagefile secrets could be preserved after VM destruction
# disks may not be wiped
Fixes:
# don't give a transient cluster access to keys needed to decrypt persistent data, other than those needed by specific jobs
# explore with your virtualization/cloud service provider what their VM and virtual disk security policies are: when do the virtual disks get wiped, and how rigorously?
Other things to worry about:
# malicious DNs joining the cluster. Again, it's hard to block this in a cloud, as hostnames aren't known in advance (so you can't have them on the included-host list). Fix: use a VPN rather than any datacentre-wide network.
# fundamental security holes in core dependency libraries (OS & JVM layer). Keep your machines up to date, and have mechanisms for renewing and revoking certificates, ...
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966140#comment-13966140 ] Yi Liu commented on HADOOP-10150: - Todd, thanks for your comments. {quote}A few questions here... First, let me confirm my understanding of the key structure and storage: Client master key: this lives on the Key Management Server, and might be different from application to application.{quote} Yes. {quote}In many cases there may be just one per cluster, though in a multitenant cluster, perhaps we could have one per tenant.{quote} It depends on the KeyProvider implementation; these kinds of details can be encapsulated into the KeyProvider implementation, which could be pluggable in CFS. Thus, customers can use their own strategy to deploy one master key or multiple master keys, by application or by user group, etc. {quote}Data key: this is set per encrypted directory. This key is stored in the directory xattr on the NN, but encrypted by the client master key (which the NN doesn't know).{quote} Yes. {quote}So, when a client wants to read a file, the following is the process: 1) Notices that the file is in an encrypted directory. Fetches the encrypted data key from the NN's xattr on the directory. 2) Somehow associates this encrypted data key with the master key that was used to encrypt it (perhaps it's tagged with some identifier). Fetches the appropriate master key from the key store. 2a) The keystore somehow authenticates and authorizes the client's access to this key 3) The client decrypts the data key using the master key, and is now able to set up a decrypting stream for the file itself. (I've ignored the IV here, but assume it's also stored in an xattr){quote} Yes. {quote}In terms of attack vectors: let's say that the NN disk is stolen. The thief now has access to a bunch of keys, but they're all encrypted by various master keys. So we're OK.{quote} Yes.
{quote}let's say that a client is malicious. It can get whichever master keys it has access to from the KMS. If we only have one master key per cluster, then the combination of one malicious client plus stealing the fsimage will give up all the keys{quote} When a client gets access to the master key and the fsimage, there is nothing we can do to protect that data. The separation of data encryption key and master key is for master key rotation, so that one does not need to decrypt all data files and then encrypt them again with the new encryption key. {quote}let's say that a client has escalated to root access on one of the slave nodes in the cluster, or otherwise has malicious access to a NodeManager process. By looking at a running MR task, it could steal whatever credentials the task is using to access the KMS, and/or dump the memory of the client process in order to give up the master key above.{quote} When a client has root access, all information can be dumped from any process, right? I remember Nicholas asked a similar question on HDFS-6134. If a client has escalated to root access on slave nodes, how can we assume the namenode and standby/secondary namenode are secure in the same cluster? On the other hand, as long as data keys remain in encrypted form in the process memory of the NameNode and DataNodes, and those nodes don't have access to the wrapping keys, then there is no attack vector there. {quote}How does the MR task in this context get the credentials to fetch keys from the KMS? If the KMS accepts the same authentication tokens as the NameNode, then is there any reason that this is more secure than having the NameNode supply the keys? Or is it just that decoupling the NameNode and the key server allows this approach to work for non-HDFS filesystems, at the expense of an additional daemon running a key distribution service?{quote} It is a good question. Securely distributing secrets among the cluster nodes, as you mentioned, will always be a hard problem to solve.
Without adequate hardware support, it could possibly be a weak point during operations like unwrapping keys. We want to leave options to the KeyProvider implementation to decouple the key protection mechanism from the data encryption mechanism, and to make the above two work on top of any filesystem. Is it possible to have a KeyProvider implementation which uses the NN as KMS, as we already discussed, and still leave room for other parties to plug in their own solutions?
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959123#comment-13959123 ] Todd Lipcon commented on HADOOP-10150: -- A few questions here... First, let me confirm my understanding of the key structure and storage:
- Client master key: this lives on the Key Management Server, and might be different from application to application. In many cases there may be just one per cluster, though in a multitenant cluster, perhaps we could have one per tenant.
- Data key: this is set per encrypted directory. This key is stored in the directory xattr on the NN, but encrypted by the client master key (which the NN doesn't know).
So, when a client wants to read a file, the following is the process:
1) Notices that the file is in an encrypted directory. Fetches the encrypted data key from the NN's xattr on the directory.
2) Somehow associates this encrypted data key with the master key that was used to encrypt it (perhaps it's tagged with some identifier). Fetches the appropriate master key from the key store.
2a) The keystore somehow authenticates and authorizes the client's access to this key.
3) The client decrypts the data key using the master key, and is now able to set up a decrypting stream for the file itself. (I've ignored the IV here, but assume it's also stored in an xattr)
In terms of attack vectors:
- let's say that the NN disk is stolen. The thief now has access to a bunch of keys, but they're all encrypted by various master keys. So we're OK.
- let's say that a client is malicious. It can get whichever master keys it has access to from the KMS. If we only have one master key per cluster, then the combination of one malicious client plus stealing the fsimage will give up all the keys.
- let's say that a client has escalated to root access on one of the slave nodes in the cluster, or otherwise has malicious access to a NodeManager process.
By looking at a running MR task, it could steal whatever credentials the task is using to access the KMS, and/or dump the memory of the client process in order to give up the master key above. Does the above look right? It would be nice to add to the design doc a clear description of the threat model here. Do we assume that the adversary will never have root on the cluster? Do we assume the adversary won't have access to the "mapred" user (or whoever runs the NM)? How does the MR task in this context get the credentials to fetch keys from the KMS? If the KMS accepts the same authentication tokens as the NameNode, then is there any reason that this is more secure than having the NameNode supply the keys? Or is it just that decoupling the NameNode and the key server allows this approach to work for non-HDFS filesystems, at the expense of an additional daemon running a key distribution service?
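The key structure described in this thread — a wrapped per-directory data key in an NN xattr, unwrapped on the client with a master key the NN never sees — can be sketched with stock JCE primitives. This is a standalone illustration, not code from the patches: the xattr names are invented, a plain map stands in for the NN, and RFC 3394 AES key wrap is used where the real design may differ.

```java
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

public class EnvelopeDemo {
    // Stand-in for the NN xattrs on an encrypted directory (names invented).
    static Map<String, byte[]> xattrs = new HashMap<>();

    static void createEncryptedDir(SecretKey masterKey, String masterKeyVersion) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey dataKey = kg.generateKey();            // per-directory data key
        Cipher wrap = Cipher.getInstance("AESWrap");     // RFC 3394 key wrap
        wrap.init(Cipher.WRAP_MODE, masterKey);
        xattrs.put("crypto.data.key", wrap.wrap(dataKey)); // NN only sees the wrapped key
        xattrs.put("crypto.key.version", masterKeyVersion.getBytes());
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);
        xattrs.put("crypto.iv", iv);                     // per Todd's note, IV in an xattr too
    }

    // Client side: fetch the wrapped key from the "NN" and unwrap it locally.
    static SecretKey unwrapDataKey(SecretKey masterKey) throws Exception {
        Cipher unwrap = Cipher.getInstance("AESWrap");
        unwrap.init(Cipher.UNWRAP_MODE, masterKey);
        return (SecretKey) unwrap.unwrap(xattrs.get("crypto.data.key"), "AES", Cipher.SECRET_KEY);
    }

    public static void main(String[] args) throws Exception {
        SecretKey master = new SecretKeySpec(new byte[16], "AES"); // demo key only
        createEncryptedDir(master, "clusterKey@1");
        SecretKey dataKey = unwrapDataKey(master);
        System.out.println("data key length: " + dataKey.getEncoded().length);
    }
}
```

Stealing the "NN" state here yields only the 24-byte wrapped blob; without the master key it decrypts nothing, which is the fsimage-theft property discussed above.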
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13954994#comment-13954994 ] Yi Liu commented on HADOOP-10150: - [~tucu00], thanks for the comments. {quote}Regarding hflush, hsync. Unless I’m missing something, if the hflush/hsync is done at an offset which is not MOD of 16, things will break as the IV advancing is done on per encryption block (16 bytes).{quote} Hflush/hsync will work well in CFS. The key point is that in CTR mode the cipher has some characteristics of a stream cipher: encryption can be done for any size of data, and we can decrypt at any random byte offset; the counter is calculated using the formula in our design doc. {quote}The Cfs.getDataKey(), it is not clear how the master key is to be fetched by clients and by job tasks. Plus, it seems that the idea is that every client job task will get hold of the master key (which could decrypt all stored keys).{quote} cfs.getDataKey() could be refactored to use Owen’s HADOOP-10141 key provider interface, and thus decoupled from the underlying KMS system. In the attached patch, we’d like to show that the master key served from the client side could be used to decrypt the data encryption key. This client master key could be different from user to user. The master key can be retrieved from KMS and served via Owen’s HADOOP-10141 key provider interface as well; it is pluggable, and an end user can provide his own implementation. A similar approach can be seen in HADOOP-9333 and MAPREDUCE-4491, where we had quite a lot of discussion with @Benoy Antony. {quote}Also, there is no provision to allow master key rotation.{quote} Since the client master key is controlled by the client, the client is responsible for key rotation.
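The stream-cipher property described here — advance the counter by offset/16, then discard offset mod 16 bytes of keystream — can be demonstrated with the stock JCE AES/CTR cipher, whose counter is the IV treated as a big-endian 128-bit integer. The class and method names below are mine, not CFS's; this is a sketch of the technique, not the patch's code.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class CtrSeekDemo {
    static final int BLOCK = 16;

    // Advance the 128-bit big-endian counter by 'blocks' (with carry).
    static byte[] advanceIv(byte[] iv, long blocks) {
        byte[] out = iv.clone();
        for (int i = BLOCK - 1; i >= 0 && blocks != 0; i--) {
            long sum = (out[i] & 0xFFL) + (blocks & 0xFF);
            out[i] = (byte) sum;
            blocks = (blocks >>> 8) + (sum >>> 8);
        }
        return out;
    }

    // Decrypt 'len' bytes of ciphertext starting at absolute stream offset 'pos',
    // without touching any earlier bytes -- this is what makes seek/pread cheap.
    static byte[] decryptAt(byte[] key, byte[] iv, byte[] cipherText,
                            long pos, int len) throws Exception {
        Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
        c.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
               new IvParameterSpec(advanceIv(iv, pos / BLOCK)));
        int pad = (int) (pos % BLOCK);              // partial-block offset
        // Feed 'pad' dummy bytes first so the keystream lines up, then the real data.
        byte[] shifted = new byte[pad + len];
        System.arraycopy(cipherText, (int) pos, shifted, pad, len);
        byte[] plain = c.doFinal(shifted);
        return Arrays.copyOfRange(plain, pad, pad + len);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16], iv = new byte[16];
        new SecureRandom().nextBytes(key);
        new SecureRandom().nextBytes(iv);
        byte[] plain = "HADOOP-10150: CTR allows random-access decryption"
                .getBytes(StandardCharsets.UTF_8);
        Cipher enc = Cipher.getInstance("AES/CTR/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        byte[] cipherText = enc.doFinal(plain);

        // Seek to byte 20 (not a multiple of 16) and decrypt 10 bytes.
        byte[] part = decryptAt(key, iv, cipherText, 20, 10);
        System.out.println(new String(part, StandardCharsets.UTF_8)); // prints "lows rando"
    }
}
```

Because any byte range decrypts independently, Seek, PositionedReadable and pread fall out of the mode for free, which is requirement 2/7 in the issue description.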
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948589#comment-13948589 ] Alejandro Abdelnur commented on HADOOP-10150: - [~hitliuyi], thanks for the detailed answers. I’ll answer in more detail later; just a couple of things now that jumped out after a quick look at the patches. I like the use of xAttr. Regarding hflush, hsync: unless I’m missing something, if the hflush/hsync is done at an offset which is not MOD of 16, things will break, as the IV advancing is done per encryption block (16 bytes). The Cfs.getDataKey(): it is not clear how the master key is to be fetched by clients and by job tasks. Plus, it seems that the idea is that every client job task will get hold of the master key (which could decrypt all stored keys). Also, there is no provision to allow master key rotation. More later.
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946828#comment-13946828 ] Yi Liu commented on HADOOP-10150: - Thanks [~tucu00] for your comment. We are less concerned about the internal use of the HDFS client; on the contrary, we care more about making encrypted data easy for clients. We did find that webhdfs should use DistributedFileSystem as well, to remove the symlink issue as HDFS-4933 stated (the issue we found is “Throwing UnresolvedPathException when getting HDFS symlink file through HDFS REST API”, and there are no “statistics” for HDFS REST, which is inconsistent with the behavior of DistributedFileSystem; we suppose that JIRA will resolve it). “Transparent” or “at rest” encryption usually means that the server handles encrypting data for persistence, but does not manage keys for particular clients or applications, nor require applications to even be aware that encryption is in use. Hence how it can be described as transparent. This type of solution distributes secret keys within the secure enclave (not to clients), or might employ a two-tier key architecture (data keys wrapped by the cluster secret key), but with keys typically managed per application, e.g. in a database system, per table. The goal here is to avoid data leakage from the server by universally encrypting data “at rest”. Other cryptographic application architectures handle use cases where clients or applications want to protect data with encryption from other clients or applications. For those use cases encryption and decryption is done on the client, and the scope of key sharing should be minimized to where the cryptographic operations take place. In this type of solution the server becomes an unnecessary central point of compromise for user or application keys, so sharing there should be avoided.
This isn’t really an “at rest” solution: because the client may or may not choose to encrypt, and because key sharing is minimized, the server cannot and should not be able to distinguish encrypted data from random bytes, so it cannot guarantee all persisted data is encrypted. Therefore we have two different types of solutions, useful for different reasons, with different threat models. Combinations of the two must be carefully done (or avoided) so as not to end up with something combining the worst of both threat models. HDFS-6134 and HADOOP-10150 are orthogonal and complementary solutions when viewed in this light. HDFS-6134, as described at least by the JIRA title, wants to introduce transparent encryption within HDFS. In my opinion, it shouldn’t attempt “client side encryption on the server”, for the reasons mentioned above. HADOOP-10150 wants to make management of partially encrypted data easy for clients, for the client-side encryption use cases, by presenting a filtered view over base Hadoop filesystems like HDFS. {quote}in the "Storage of IV and data key" is stated "So we implement extended information based on INode feature, and use it to store data key and IV."{quote} We assume HDFS-2006 could help; that’s why we put up separate patches. In the CFS patch this was decoupled from the underlying filesystem if xattrs are present. And it could be the end user’s choice whether to store the key alias or the data encryption key. {quote}(Mentioned before), how will flush() operations be handled as the encryption block will be cut short? How is this handled on writes? How is this handled on reads?{quote} For hflush/hsync, actually it's very simple. In the cryptographic output stream of CFS, we buffer the plain text in a cache and do encryption only once the data size reaches the buffer length, to improve performance. So for hflush/hsync, we just need to flush the buffer and do encryption immediately, and then call FSDataOutputStream.hflush/hsync, which will handle the remaining things.
{quote}Still, it is not clear how transparency will be achieved for existing applications: HDFS URI changes, clients must connect to the Key store to retrieve the encryption key (clients will need key store principals). The encryption key must be propagated to job tasks (i.e. Mapper/Reducer processes){quote} There is no URL change; please see the latest design doc and test case. We have considered HADOOP-9534 and HADOOP-10141; encryption of key material could be handled by the implementation of key providers, according to the customer's environment. {quote}Use of AES-CTR (instead of an authenticated encryption mode such as AES-GCM){quote} AES-GCM introduces additional CPU cycles for GHASH: 2.5x additional cycles on Sandy Bridge and Ivy Bridge, 0.6x additional cycles on Haswell. Data integrity is ensured by the underlying filesystem, like HDFS, in this scenario. We decided to use AES-CTR for best performance. Furthermore, AES-GCM mode is not available as a JCE cipher in Java 6. It may be EOL, but plenty of Hadoopers are still running it.
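The buffer-then-encrypt-on-flush behaviour described for the cryptographic output stream can be sketched as below. This is an illustration with invented names, not CFS's actual class: it relies on the fact that JCE's AES/CTR `Cipher.update` keeps the keystream position across calls, so flushing at an offset that is not a multiple of 16 is harmless.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: plaintext is buffered and only pushed through the CTR
// cipher when the buffer fills or flush() (the hflush/hsync path) is called.
public class CryptoOutputStream extends FilterOutputStream {
    private final Cipher cipher;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final int bufferSize;

    public CryptoOutputStream(OutputStream out, byte[] key, byte[] iv,
                              int bufferSize) throws Exception {
        super(out);
        this.bufferSize = bufferSize;
        cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(iv));
    }

    @Override public void write(int b) throws IOException {
        buffer.write(b);
        if (buffer.size() >= bufferSize) flush();
    }

    // hflush/hsync path: encrypt whatever is buffered, even a partial 16-byte
    // block -- Cipher.update carries the CTR keystream position forward.
    @Override public void flush() throws IOException {
        if (buffer.size() > 0) {
            out.write(cipher.update(buffer.toByteArray()));
            buffer.reset();
        }
        out.flush();  // the wrapped FSDataOutputStream would hflush/hsync here
    }
}
```

A flush after 6 bytes followed by more writes produces byte-for-byte the same ciphertext as one unbroken encryption, which is why the MOD-16 concern above does not arise for CTR.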
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946822#comment-13946822 ] Yi Liu commented on HADOOP-10150: - We are less concerned with the internal use of the HDFS client; we care more about making encrypted data easy for clients to manage. That said, we found that webhdfs should use DistributedFileSystem as well, to remove the symlink issue described in HDFS-4933 (the issue we found is “Throwing UnresolvedPathException when getting HDFS symlink file through HDFS REST API”, and there are no “statistics” for the HDFS REST API, which is inconsistent with the behavior of DistributedFileSystem; we expect that JIRA to resolve it). “Transparent” or “at rest” encryption usually means that the server handles encrypting data for persistence, but does not manage keys for particular clients or applications, nor require applications to even be aware that encryption is in use; hence it can be described as transparent. This type of solution distributes secret keys within the secure enclave (not to clients), or might employ a two-tier key architecture (data keys wrapped by the cluster secret key), with keys typically managed per application, e.g. per table in a database system. The goal is to avoid data leakage from the server by universally encrypting data “at rest”. Other cryptographic application architectures handle use cases where clients or applications want to protect data with encryption from other clients or applications. For those use cases, encryption and decryption are done on the client, and the scope of key sharing should be minimized to where the cryptographic operations take place. In this type of solution the server becomes an unnecessary central point of compromise for user or application keys, so key sharing there should be avoided. 
This isn’t really an “at rest” solution, because the client may or may not choose to encrypt, and because key sharing is minimized the server cannot (and should not be able to) distinguish encrypted data from random bytes, so it cannot guarantee all persisted data is encrypted. Therefore we have two different types of solutions, useful for different reasons, with different threat models. Combinations of the two must be done carefully (or avoided) so as not to end up with something combining the worst of both threat models. Viewed in this light, HDFS-6134 and HADOOP-10150 are orthogonal and complementary solutions. HDFS-6134, as described at least by the JIRA title, wants to introduce transparent encryption within HDFS; in my opinion, it shouldn’t attempt “client side encryption on the server”, for the reasons mentioned above. HADOOP-10150 wants to make management of partially encrypted data easy for clients, for the client-side encryption use cases, by presenting a filtered view over base Hadoop filesystems like HDFS. {quote}in the "Storage of IV and data key" is stated "So we implement extended information based on INode feature, and use it to store data key and IV. "{quote} We assume HDFS-2006 could help; that’s why we put up separate patches. In the CFS patch, this is decoupled from the underlying filesystem if xattrs are present. And it can be the end user’s choice whether to store a key alias or the data encryption key. {quote}(Mentioned before), how thing flush() operations will be handled as the encryption block will be cut short? How this is handled on writes? How this is handled on reads?{quote} For hflush/hsync it is actually very simple. In the cryptographic output stream of CFS, we buffer the plain text and only encrypt once the data size reaches the buffer length, to improve performance. So for hflush/hsync we just flush the buffer and do the encryption immediately, and then call FSDataOutputStream.hflush/hsync, which handles the rest. 
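The hflush/hsync answer above relies on a property of AES-CTR: encrypting the buffered plaintext early and then continuing with the same cipher state yields the same ciphertext as encrypting everything in one pass. A minimal sketch with the stock JCE provider (the all-zero key and IV are dummy values for illustration only):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class CtrFlushDemo {
    // Encrypts the same plaintext in one shot and in two pieces (as if an
    // hflush() forced the buffer out early) and checks the ciphertexts match.
    static boolean chunkingMatches() throws Exception {
        byte[] key = new byte[16];  // dummy all-zero key, illustration only
        byte[] iv = new byte[16];   // dummy all-zero IV, illustration only
        byte[] plain = "buffered plaintext, flushed early".getBytes(StandardCharsets.UTF_8);

        Cipher oneShot = Cipher.getInstance("AES/CTR/NoPadding");
        oneShot.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        byte[] expected = oneShot.doFinal(plain);

        Cipher chunked = Cipher.getInstance("AES/CTR/NoPadding");
        chunked.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        byte[] first = chunked.update(plain, 0, 5);   // "flush" after 5 bytes
        if (first == null) first = new byte[0];       // a provider may buffer
        byte[] rest = chunked.doFinal(plain, 5, plain.length - 5);

        byte[] actual = new byte[first.length + rest.length];
        System.arraycopy(first, 0, actual, 0, first.length);
        System.arraycopy(rest, 0, actual, first.length, rest.length);
        return Arrays.equals(expected, actual);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(chunkingMatches());
    }
}
```

Because the keystream position is carried inside the Cipher, the wrapped FSDataOutputStream only ever sees ciphertext, regardless of where the flush boundaries fall.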
{quote}Still, it is not clear how transparency will be achieved for existing applications: HDFS URI changes, clients must connect to the Key store to retrieve the encryption key (clients will need key store principals). The encryption key must be propagated to jobs tasks (i.e. Mapper/Reducer processes){quote} There is no URI change; please see the latest design doc and test case. We have considered HADOOP-9534 and HADOOP-10141; encryption of key material can be handled by the key provider implementation according to the customer’s environment. {quote}Use of AES-CTR (instead of an authenticated encryption mode such as AES-GCM){quote} AES-GCM introduces additional CPU cycles for GHASH: 2.5x additional cycles on Sandy Bridge and Ivy Bridge, 0.6x additional cycles on Haswell. Data integrity is ensured by the underlying filesystem (e.g. HDFS) in this scenario, so we decided to use AES-CTR for best performance. Furthermore, AES-GCM is not available as a JCE cipher in Java 6. Java 6 may be EOL, but plenty of Hadoopers are still running it. It's not even listed on the Java 7 Sun provider document (http://
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946037#comment-13946037 ] Avik Dey commented on HADOOP-10150: --- [~tucu00] there are two patches posted by [~hitliuyi]. If you apply the xattrs patch first, the cfs patch should then apply cleanly: https://issues.apache.org/jira/secure/attachment/12636026/extended%20information%20based%20on%20INode%20feature.patch In the posted cfs patch you will see there is no need to change the HDFS URI, for example. I thought that was in the latest doc, but apparently not. Anyway, let me know if you are still unable to apply the patches; I think that may help clear up a few of the questions you have posted. > Hadoop cryptographic file system > > > Key: HADOOP-10150 > URL: https://issues.apache.org/jira/browse/HADOOP-10150 > Project: Hadoop Common > Issue Type: New Feature > Components: security >Affects Versions: 3.0.0 >Reporter: Yi Liu >Assignee: Yi Liu > Labels: rhino > Fix For: 3.0.0 > > Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file > system-V2.docx, HADOOP cryptographic file system.pdf, cfs.patch, extended > information based on INode feature.patch > > > There is an increasing need for securing data when Hadoop customers use > various upper layer applications, such as Map-Reduce, Hive, Pig, HBase and so > on. > HADOOP CFS (HADOOP Cryptographic File System) is used to secure data, based > on HADOOP “FilterFileSystem” decorating DFS or other file systems, and > transparent to upper layer applications. It’s configurable, scalable and fast. > High level requirements: > 1.Transparent to and no modification required for upper layer > applications. > 2.“Seek”, “PositionedReadable” are supported for input stream of CFS if > the wrapped file system supports them. > 3.Very high performance for encryption and decryption, they will not > become bottleneck. 
> 4.Can decorate HDFS and all other file systems in Hadoop, and will not > modify existing structure of file system, such as namenode and datanode > structure if the wrapped file system is HDFS. > 5.Admin can configure encryption policies, such as which directory will > be encrypted. > 6.A robust key management framework. > 7.Support Pread and append operations if the wrapped file system supports > them. -- This message was sent by Atlassian JIRA (v6.2#6252)
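The “FilterFileSystem decorating DFS or other file systems” design in the description above can be sketched at the stream level with plain java.io. The class below is an illustrative stand-in, not the CFS patch's code: the class name, the AES/CTR choice of the discussion, and the key handling are assumptions, and Hadoop's FilterFileSystem/FSDataOutputStream are replaced by java.io.FilterOutputStream so the sketch is self-contained:

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Illustrative decorator stream: plaintext written by the caller is
// AES/CTR-encrypted before it reaches the wrapped stream, so upper layers
// use the stream unchanged while only ciphertext is persisted.
public class CryptoOut extends FilterOutputStream {
    private final Cipher cipher;

    public CryptoOut(OutputStream out, byte[] key, byte[] iv) throws GeneralSecurityException {
        super(out);
        cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        byte[] ct = cipher.update(b, off, len);
        if (ct != null) {
            out.write(ct);  // only ciphertext reaches the underlying stream
        }
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] { (byte) b }, 0, 1);
    }

    @Override
    public void close() throws IOException {
        try {
            byte[] last = cipher.doFinal();  // drain anything the provider buffered
            if (last.length > 0) out.write(last);
        } catch (GeneralSecurityException e) {
            throw new IOException(e);
        }
        super.close();
    }
}
```

A matching read-side decorator would wrap the input stream with a DECRYPT_MODE cipher; with CTR, seek support reduces to re-initializing the counter for the target offset.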
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945577#comment-13945577 ] Alejandro Abdelnur commented on HADOOP-10150: - (Cross-posting HADOOP-10150 & HDFS-6134) [~avik_...@yahoo.com], I’ve just looked at the MAR/21 proposal in HADOOP-10150 (the patches uploaded on MAR/21 do not apply cleanly on trunk, so I cannot look at them easily. They seem to have missing pieces, like getXAttrs() and wiring to the KeyProvider API. Would it be possible to rebase them so they apply to trunk?) bq. do we need a new proposal for the work already being done on HADOOP-10150? HADOOP-10150 aims to provide encryption for any filesystem implementation as a decorator filesystem, while HDFS-6134 aims to provide encryption for HDFS. The two approaches differ in the level of transparency you get. The comparison table in the "HDFS Data at Rest Encryption" attachment (https://issues.apache.org/jira/secure/attachment/12635964/HDFSDataAtRestEncryption.pdf) highlights the differences. In particular, the things I’m most concerned about with HADOOP-10150 are: * All clients (doing encryption/decryption) must have access to the key management service. * Secure key propagation to tasks running in the cluster (i.e. mapper and reducer tasks). * Use of AES-CTR (instead of an authenticated encryption mode such as AES-GCM). * It is not clear how hflush() is handled. bq. are there design choices in this proposal that are superior to the patch already provided on HADOOP-10150? IMO, consolidated access/distribution of keys by the NN (as opposed to every client) improves the security of the system. bq. do you have additional requirement listed in this JIRA that could be incorporated in to HADOOP-10150, They are enumerated in the "HDFS Data at Rest Encryption" attachment. The ones I don’t see addressed in HADOOP-10150 are #6 and #8.A, and it is not clear how #4 & #5 can be achieved. bq. so we can collaborate and not duplicate? 
Definitely, I want to work together with you guys to leverage as much as possible, either by unifying the two proposals or by sharing common code if we think both approaches have merit and we decide to move forward with both. Happy to jump on a call to discuss things and then report back to the community, if you think that will speed up the discussion. -- Looking at the latest design doc of HADOOP-10150, I can see that things have been modified a bit (from the original design doc), bringing it a bit closer to some of the HDFS-6134 requirements. Still, it is not clear how transparency will be achieved for existing applications: HDFS URI changes; clients must connect to the key store to retrieve the encryption key (clients will need key store principals); the encryption key must be propagated to job tasks (i.e. Mapper/Reducer processes). Requirement #4, "Can decorate HDFS and all other file systems in Hadoop, and will not modify existing structure of file system, such as namenode and datanode structure if the wrapped file system is HDFS.", is contradicted by the design: the "Storage of IV and data key" section states "So we implement extended information based on INode feature, and use it to store data key and IV." Requirement #5, "Admin can configure encryption policies, such as which directory will be encrypted.", seems driven by the HDFS client configuration file (hdfs-site.xml). This is not really admin driven, as clients could break it by changing their own hdfs-site.xml. Restrictions on move operations for files within an encrypted directory: the original design had something about this (not entirely correct); now it is gone. (Mentioned before) How will flush() operations be handled, given that the encryption block will be cut short? How is this handled on writes? How on reads? Explicit auditing of encrypted file access does not seem to be handled. 
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846078#comment-13846078 ] Yi Liu commented on HADOOP-10150: - Larry, the patch attached to HADOOP-10156 (a subtask of HADOOP-10150) is a pure Java implementation without any external dependencies. The first patch we put up did contain hadoop-crypto, a crypto codec framework that includes some non-Java code implemented in C. However, the latest patch on HADOOP-10156 instead provides ciphers through the standard javax.crypto.Cipher interface, using the cipher implementations shipped with the JRE by default, instead of hadoop-crypto. Java itself provides the mechanism for supplemental Cipher implementations: the JCE (Java Cryptography Extension). Because the default JCE providers shipped with common JREs do not utilize the hardware acceleration (AES-NI) that has been available for years, we have also developed a pure open source, Apache 2 licensed JCE provider named Diceros to mitigate the performance penalties. Our initial tests show a 20x improvement over the ciphers shipped with JRE 7. We would like to contribute Diceros as well, but to simplify review we are hosting Diceros on GitHub for now. The code submitted for HADOOP-10156 allows the end user to configure any JCE provider - for example, the default JCE provider shipped with JREs, Diceros ("DC") or BouncyCastle ("BC"). Please let me know if you have any other concerns about this approach. Thanks. 
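The "any JCE provider" configurability described above boils down to the two-argument Cipher.getInstance lookup. A hedged sketch follows; the fallback policy and method name are illustrative, and only the provider short names "DC" and "BC" come from the discussion:

```java
import java.security.NoSuchProviderException;
import javax.crypto.Cipher;

public class ProviderSelect {
    // Returns an AES/CTR Cipher from the named JCE provider (e.g. "DC" for
    // Diceros or "BC" for BouncyCastle), falling back to the JRE default
    // when that provider is not installed. Illustrative sketch only.
    public static Cipher aesCtr(String providerName) throws Exception {
        if (providerName != null) {
            try {
                return Cipher.getInstance("AES/CTR/NoPadding", providerName);
            } catch (NoSuchProviderException e) {
                // Named provider not on the classpath; use the default below.
            }
        }
        return Cipher.getInstance("AES/CTR/NoPadding");
    }

    public static void main(String[] args) throws Exception {
        // Diceros is unlikely to be installed here, so this falls back.
        Cipher c = aesCtr("DC");
        System.out.println(c.getProvider().getName() + " / " + c.getAlgorithm());
    }
}
```

Because the lookup is by provider name, swapping in an accelerated provider is a configuration change, not a code change, which is the point of routing everything through the JCE interface.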
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13844371#comment-13844371 ] Larry McCay commented on HADOOP-10150: -- Hi Yi - I am a bit confused by this latest comment. Can you please clarify "hadoop-crypto component was removed from latest patch as a result of Diceros emerging."? Are you saying that initially you had a cipher provider implementation but have decided not to provide one, since there is one available in yet another non-Apache project? I don't believe that these sorts of external references are really appropriate; neither Rhino nor Diceros is a TLP or an incubating project in Apache. Since it appears to be an Intel-specific implementation, it does seem appropriate to remove it from the patch, though. Do you plan to provide an all-Java implementation for this work?
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13844347#comment-13844347 ] Yi Liu commented on HADOOP-10150: - Created a sub task, HADOOP-10156. That JIRA defines Encryptor and Decryptor, which are buffer-based interfaces for encryption and decryption. The standard javax.crypto.Cipher interface is employed to provide the AES/CTR encryption/decryption implementation; this way, one can replace the javax.crypto.Cipher implementation by plugging in another JCE provider such as Diceros. Diceros is an open source project under the Rhino project that implements the Cipher interface and provides high performance encryption/decryption compared to the default JCE provider. Initial performance test results show a 20x speedup in CTR mode compared to the default JCE provider in JDK 1.7_u45. Moreover, the Encryptor/Decryptor interfaces implement an internal buffer to further improve performance over javax.crypto.Cipher. The hadoop-crypto component was removed from the latest patch as a result of Diceros emerging. One can use "cfs.cipher.provider" to specify the JCE provider. Diceros project link: https://github.com/intel-hadoop/diceros
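A hedged sketch of what a buffer-based Encryptor/Decryptor pair over javax.crypto.Cipher might look like; the interface names, signatures, and one-shot backing class below are illustrative, not the actual HADOOP-10156 API:

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Illustrative buffer-based interfaces in the spirit of the HADOOP-10156
// description; not the patch's actual API.
interface Encryptor {
    void encrypt(ByteBuffer plain, ByteBuffer cipherText) throws GeneralSecurityException;
}

interface Decryptor {
    void decrypt(ByteBuffer cipherText, ByteBuffer plain) throws GeneralSecurityException;
}

// One possible backing: the JCE Cipher's ByteBuffer overloads. With AES/CTR,
// encryption and decryption are the same keystream XOR, so both directions
// share the same mechanics. Any JCE provider can supply the Cipher.
public class JceCtrCodec implements Encryptor, Decryptor {
    private final byte[] key;
    private final byte[] iv;

    public JceCtrCodec(byte[] key, byte[] iv) {
        this.key = key;
        this.iv = iv;
    }

    private void run(int mode, ByteBuffer in, ByteBuffer out) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        cipher.doFinal(in, out);  // one-shot for simplicity of the sketch
    }

    @Override
    public void encrypt(ByteBuffer plain, ByteBuffer cipherText) throws GeneralSecurityException {
        run(Cipher.ENCRYPT_MODE, plain, cipherText);
    }

    @Override
    public void decrypt(ByteBuffer cipherText, ByteBuffer plain) throws GeneralSecurityException {
        run(Cipher.DECRYPT_MODE, cipherText, plain);
    }
}
```

A streaming implementation would hold the Cipher across calls (using update rather than doFinal) and could use direct ByteBuffers to avoid copies, which is where the internal buffering mentioned above pays off.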
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843114#comment-13843114 ] Yi Liu commented on HADOOP-10150: - Hi Owen, I have filed 5 sub tasks, and initial patches will be attached later. I want to use HADOOP-10149 to attach the ByteBufferCipher API patch.
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843098#comment-13843098 ] Yi Liu commented on HADOOP-10150: - Thanks Uma, I am working on breaking down the patches and creating sub-task JIRAs. I will convert this JIRA to the Common project.
[jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
[ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843097#comment-13843097 ] Yi Liu commented on HADOOP-10150: - Hi Owen, thanks for bringing it up here. I am working on breaking down the patches and creating sub-task JIRAs, as mentioned in my previous response. The rest of your comment seems to be about a different JIRA and is probably best discussed there. * HADOOP-10149: since I have that patch already implemented, do you mind assigning it to me? I will take that piece of code and apply it there for review. * Since HADOOP-10141 tries to improve on HADOOP-9333, why not provide your feedback on HADOOP-9333 instead of opening a JIRA that duplicates part of that work?