[jira] [Updated] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-6591: Target Version/s: 3.0.0, 2.5.0 Hadoop Flags: Reviewed Thanks for the explanation. Yes, I had been looking at an outdated patch, so I apologize for the confusion. This all makes sense now. +1 for the patch. I can commit it in the morning. while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591-v2.txt, HDFS-6591-v3.txt, HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045632#comment-14045632 ] Hadoop QA commented on HDFS-6591:
{color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12652746/HDFS-6591-v3.txt against trunk revision.
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test file.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7237//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7237//console
This message is automatically generated. while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591-v2.txt, HDFS-6591-v3.txt, HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6596) Improve InputStream when read spans two blocks
[ https://issues.apache.org/jira/browse/HDFS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6596: - Attachment: HDFS-6596.2.patch Uploaded a polished version. Improve InputStream when read spans two blocks -- Key: HDFS-6596 URL: https://issues.apache.org/jira/browse/HDFS-6596 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu Attachments: HDFS-6596.1.patch, HDFS-6596.2.patch In the current implementation of DFSInputStream, read(buffer, offset, length) is implemented as follows:
{code}
int realLen = (int) Math.min(len, (blockEnd - pos + 1L));
if (locatedBlocks.isLastBlockComplete()) {
  realLen = (int) Math.min(realLen, locatedBlocks.getFileLength());
}
int result = readBuffer(strategy, off, realLen, corruptedBlockMap);
{code}
From the above code, we can conclude that the read will return at most (blockEnd - pos + 1) bytes. As a result, when a read spans two blocks, the caller must call read() a second time to complete the request, and must wait a second time to acquire the DFSInputStream lock (read() is synchronized for DFSInputStream). For latency-sensitive applications, such as HBase, this results in a latency pain point when they are under heavy contention. So we propose looping internally in read() to do a best-effort read. The current implementation of pread (read(position, buffer, offset, length)) already loops internally to do a best-effort read, so we can refactor to support this on the normal read path. -- This message was sent by Atlassian JIRA (v6.2#6252)
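[Editor's note: for illustration, here is a minimal sketch of the proposed internal loop. The readOnce() helper is a hypothetical stand-in for the existing single-block read path (seek to block, compute realLen, call readBuffer); it is not the actual DFSInputStream internals.]
{code}
import java.io.IOException;

// Sketch only: readOnce() is an assumed stand-in for the current
// one-block read; it returns the bytes read, or -1 at EOF.
abstract class BestEffortReader {
  protected abstract int readOnce(byte[] buf, int off, int len) throws IOException;

  // Proposed read(): loop internally so a read spanning two blocks
  // completes under a single lock acquisition.
  public synchronized int read(byte[] buf, int off, int len) throws IOException {
    int total = 0;
    while (total < len) {
      int n = readOnce(buf, off + total, len - total);
      if (n <= 0) {
        break;  // EOF or no progress: return whatever was copied so far
      }
      total += n;
    }
    return (total == 0 && len > 0) ? -1 : total;
  }
}
{code}
This mirrors what pread already does; the only behavioral change is that a single synchronized read() call now crosses the block boundary itself.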
[jira] [Updated] (HDFS-6556) Refine XAttr permissions
[ https://issues.apache.org/jira/browse/HDFS-6556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-6556: -- Attachment: refinedPermissions-HDFS-6556-3.patch Thanks a lot, Chris for the review! Attached a patch which should address your suggestion. Refine XAttr permissions Key: HDFS-6556 URL: https://issues.apache.org/jira/browse/HDFS-6556 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.0 Reporter: Yi Liu Assignee: Uma Maheswara Rao G Attachments: RefinedPermissions-HDFS-6556-1.patch, RefinedPermissions-HDFS-6556.patch, refinedPermissions-HDFS-6556-2.patch, refinedPermissions-HDFS-6556-3.patch After discussing with Uma, we should refine the permission checks for setting {{user}} and {{trusted}} namespace xattrs. *1.* For {{user}} namespace xattrs: HDFS-6374 says setXAttr should require the user to be the owner of the file or directory, but we had a bit of a misunderstanding. The actual rule is: {quote} The access permissions for user attributes are defined by the file permission bits. Only regular files and directories can have extended attributes. For sticky directories, only the owner and privileged user can write attributes. {quote} We can refer to the Linux source code at http://lxr.free-electrons.com/source/fs/xattr.c?v=2.6.35. I also checked in Linux: it's controlled by the file permission bits for regular files and directories (not sticky). *2.* For the {{trusted}} namespace, we currently require the user to be both the owner and a superuser. Actually, superuser alone is enough. -- This message was sent by Atlassian JIRA (v6.2#6252)
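[Editor's note: a hedged sketch of the refined rule described above. The real logic lives in the NameNode's permission checker; the Caller/Node interfaces and helper names below are illustrative assumptions, not actual HDFS types.]
{code}
import org.apache.hadoop.security.AccessControlException;

// Hypothetical stand-ins for the permission-checker state.
interface Caller { boolean isSuperUser(); boolean isOwner(Node n); }
interface Node { boolean isDirectory(); boolean isSticky(); }

class XAttrAccessSketch {
  // Refined rule: "trusted" needs superuser only; "user" follows the
  // file permission bits, with the sticky-directory exception.
  static void checkSetXAttr(String ns, Node node, Caller c,
      Runnable permissionBitCheck) throws AccessControlException {
    if ("trusted".equals(ns)) {
      if (!c.isSuperUser()) {
        throw new AccessControlException("trusted xattrs: superuser only");
      }
    } else if ("user".equals(ns)) {
      if (node.isDirectory() && node.isSticky()
          && !c.isOwner(node) && !c.isSuperUser()) {
        throw new AccessControlException("sticky dir: owner or superuser only");
      }
      permissionBitCheck.run();  // the ordinary write-permission check
    }
  }
}
{code}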
[jira] [Commented] (HDFS-6596) Improve InputStream when read spans two blocks
[ https://issues.apache.org/jira/browse/HDFS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045789#comment-14045789 ] Hadoop QA commented on HDFS-6596:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12652771/HDFS-6596.2.patch against trunk revision.
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test file.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning message. See https://builds.apache.org/job/PreCommit-HDFS-Build/7238//artifact/trunk/patchprocess/diffJavadocWarnings.txt for details.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7238//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7238//console
This message is automatically generated. Improve InputStream when read spans two blocks -- Key: HDFS-6596 URL: https://issues.apache.org/jira/browse/HDFS-6596 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu Attachments: HDFS-6596.1.patch, HDFS-6596.2.patch In the current implementation of DFSInputStream, read(buffer, offset, length) is implemented as follows:
{code}
int realLen = (int) Math.min(len, (blockEnd - pos + 1L));
if (locatedBlocks.isLastBlockComplete()) {
  realLen = (int) Math.min(realLen, locatedBlocks.getFileLength());
}
int result = readBuffer(strategy, off, realLen, corruptedBlockMap);
{code}
From the above code, we can conclude that the read will return at most (blockEnd - pos + 1) bytes. As a result, when a read spans two blocks, the caller must call read() a second time to complete the request, and must wait a second time to acquire the DFSInputStream lock (read() is synchronized for DFSInputStream). For latency-sensitive applications, such as HBase, this results in a latency pain point when they are under heavy contention. So we propose looping internally in read() to do a best-effort read. The current implementation of pread (read(position, buffer, offset, length)) already loops internally to do a best-effort read, so we can refactor to support this on the normal read path. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6527) Edit log corruption due to deferred INode removal
[ https://issues.apache.org/jira/browse/HDFS-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-6527: - Attachment: HDFS-6527-addendum-test.patch Hey folks, I've been looking into this a bit and have come to the conclusion that we should actually include this fix in 2.4.1. The reason is that though the original {{addBlock}} scenario sort of incidentally can't happen in 2.4.0, I believe that a similar scenario can happen with a race between {{close}} and {{delete}}. Even though {{close}} doesn't do any sort of dropping of its lock during the duration of its RPC, the entirety of a single {{close}} operation can begin and end successfully between when the {{delete}} edit log op is logged, and when the INode is later removed in the {{delete}} call. See the attached additional test case which demonstrates the issue. This will result in a similarly invalid edit log op sequence wherein you'll see an {{OP_ADD}}, {{OP_DELETE}}, and then {{OP_CLOSE}}, which can't be successfully replayed by the NN since the {{OP_CLOSE}} will get a {{FileNotFound}}. I've seen this happen on two clusters now. Kihwal/Jing - if you agree with my analysis, let's reopen this JIRA so this fix can be included in 2.4.1, though without the {{addBlock}} test case, and with only the {{close}} test case. Edit log corruption due to deferred INode removal Key: HDFS-6527 URL: https://issues.apache.org/jira/browse/HDFS-6527 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6527-addendum-test.patch, HDFS-6527.branch-2.4.patch, HDFS-6527.trunk.patch, HDFS-6527.v2.patch, HDFS-6527.v3.patch, HDFS-6527.v4.patch, HDFS-6527.v5.patch We have seen a SBN crashing with the following error: {panel} \[Edit log tailer\] ERROR namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/xxx, penultimateBlock=NULL, lastBlock=blk_111_111, RpcClientId=, RpcCallId=-2] java.io.FileNotFoundException: File does not exist: /xxx {panel} This was caused by the deferred removal of deleted inodes from the inode map. Since getAdditionalBlock() acquires the FSN read lock and then the write lock, a deletion can happen in between. Because of deferred inode removal outside the FSN write lock, getAdditionalBlock() can get the deleted inode from the inode map with the FSN write lock held. This allows the addition of a block to a deleted file. As a result, the edit log will contain OP_ADD, OP_DELETE, followed by OP_ADD_BLOCK. This cannot be replayed by the NN, so the NN doesn't start up or the SBN crashes. -- This message was sent by Atlassian JIRA (v6.2#6252)
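[Editor's note: a sketch of the lock-gap race pattern described in this issue, using plain JDK locks and a Map as hypothetical stand-ins for the FSNamesystem lock and inode map. This is illustrative only, not the actual NameNode code.]
{code}
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class DeferredRemovalRaceSketch {
  private final ReentrantReadWriteLock fsn = new ReentrantReadWriteLock();
  private final Map<Long, Object> inodeMap;

  DeferredRemovalRaceSketch(Map<Long, Object> inodeMap) { this.inodeMap = inodeMap; }

  void addBlockLike(long inodeId, String path) throws IOException {
    fsn.readLock().lock();
    try {
      // ...cheap validation, choose target datanodes...
    } finally {
      fsn.readLock().unlock();
    }
    // <-- a delete can be logged in this gap between the two locks
    fsn.writeLock().lock();
    try {
      // Re-validate under the write lock: the file may have been deleted,
      // and deferred removal means the stale inode could even still be in
      // the map, so the real fix must check deletion state, not just null.
      Object inode = inodeMap.get(inodeId);
      if (inode == null) {
        throw new FileNotFoundException("File does not exist: " + path);
      }
      // ...only now is it safe to log OP_ADD_BLOCK...
    } finally {
      fsn.writeLock().unlock();
    }
  }
}
{code}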
[jira] [Commented] (HDFS-6572) Add an option to the NameNode that prints the software and on-disk image versions
[ https://issues.apache.org/jira/browse/HDFS-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045819#comment-14045819 ] Hudson commented on HDFS-6572: -- FAILURE: Integrated in Hadoop-Yarn-trunk #596 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/596/]) HDFS-6572. Add an option to the NameNode that prints the software and on-disk image versions. Contributed by Charles Lamb. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605928) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestMetadataVersionOutput.java Add an option to the NameNode that prints the software and on-disk image versions - Key: HDFS-6572 URL: https://issues.apache.org/jira/browse/HDFS-6572 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6572.001.patch, HDFS-6572.002.patch The HDFS namenode should have a startup option that prints the metadata versions of both the software and the on-disk version. This will be useful for debugging certain situations. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6556) Refine XAttr permissions
[ https://issues.apache.org/jira/browse/HDFS-6556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045922#comment-14045922 ] Hadoop QA commented on HDFS-6556:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12652803/refinedPermissions-HDFS-6556-3.patch against trunk revision.
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test file.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestNameNodeHttpServer
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7239//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7239//console
This message is automatically generated. Refine XAttr permissions Key: HDFS-6556 URL: https://issues.apache.org/jira/browse/HDFS-6556 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.0 Reporter: Yi Liu Assignee: Uma Maheswara Rao G Attachments: RefinedPermissions-HDFS-6556-1.patch, RefinedPermissions-HDFS-6556.patch, refinedPermissions-HDFS-6556-2.patch, refinedPermissions-HDFS-6556-3.patch After discussing with Uma, we should refine the permission checks for setting {{user}} and {{trusted}} namespace xattrs. *1.* For {{user}} namespace xattrs: HDFS-6374 says setXAttr should require the user to be the owner of the file or directory, but we had a bit of a misunderstanding. The actual rule is: {quote} The access permissions for user attributes are defined by the file permission bits. Only regular files and directories can have extended attributes. For sticky directories, only the owner and privileged user can write attributes. {quote} We can refer to the Linux source code at http://lxr.free-electrons.com/source/fs/xattr.c?v=2.6.35. I also checked in Linux: it's controlled by the file permission bits for regular files and directories (not sticky). *2.* For the {{trusted}} namespace, we currently require the user to be both the owner and a superuser. Actually, superuser alone is enough. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6572) Add an option to the NameNode that prints the software and on-disk image versions
[ https://issues.apache.org/jira/browse/HDFS-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045977#comment-14045977 ] Hudson commented on HDFS-6572: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1787 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1787/]) HDFS-6572. Add an option to the NameNode that prints the software and on-disk image versions. Contributed by Charles Lamb. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605928) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestMetadataVersionOutput.java Add an option to the NameNode that prints the software and on-disk image versions - Key: HDFS-6572 URL: https://issues.apache.org/jira/browse/HDFS-6572 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6572.001.patch, HDFS-6572.002.patch The HDFS namenode should have a startup option that prints the metadata versions of both the software and the on-disk version. This will be useful for debugging certain situations. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6601) Issues in finalizing rolling upgrade when there is a layout version change
[ https://issues.apache.org/jira/browse/HDFS-6601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045990#comment-14045990 ] Kihwal Lee commented on HDFS-6601: -- [~jingzhao] Would you review the patch? We've tested the NN rolling upgrade scenario involving a layout version change with this patch. Issues in finalizing rolling upgrade when there is a layout version change -- Key: HDFS-6601 URL: https://issues.apache.org/jira/browse/HDFS-6601 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Attachments: HDFS-6601.patch After HDFS-6545, we have noticed a couple of issues.
- The storage dir's VERSION file is not properly updated. This becomes a problem when there is a layout version change. We can have the finalization do {{storage.writeAll()}}.
- {{OP_ROLLING_UPGRADE_FINALIZE}} cannot be replayed once the corresponding {{OP_ROLLING_UPGRADE_START}} is consumed and a new fsimage is created (e.g. a rollback image). On restart, the NN terminates, complaining that it can't finalize something it didn't start. We can make the NN ignore {{OP_ROLLING_UPGRADE_FINALIZE}} if no rolling upgrade is in progress.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6572) Add an option to the NameNode that prints the software and on-disk image versions
[ https://issues.apache.org/jira/browse/HDFS-6572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046035#comment-14046035 ] Hudson commented on HDFS-6572: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1814 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1814/]) HDFS-6572. Add an option to the NameNode that prints the software and on-disk image versions. Contributed by Charles Lamb. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605928) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/HdfsServerConstants.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestMetadataVersionOutput.java Add an option to the NameNode that prints the software and on-disk image versions - Key: HDFS-6572 URL: https://issues.apache.org/jira/browse/HDFS-6572 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Charles Lamb Assignee: Charles Lamb Priority: Minor Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6572.001.patch, HDFS-6572.002.patch The HDFS namenode should have a startup option that prints the metadata versions of both the software and the on-disk version. This will be useful for debugging certain situations. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6391) Get the Key/IV from the NameNode for encrypted files in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046043#comment-14046043 ] Charles Lamb commented on HDFS-6391: LGTM, +1. Sorry about my nits, but I appreciate your indulgence. My intent with moving the imports around was definitely not to have a wholesale reordering of FSD's imports, but only to get your new ones in something of a proper ordering. +1 that we should not do a full-scale auto-format of imports, and +1 that removing unused imports is good since IDEs tend to whine about them. There's still a whitespace change in HdfsConstants, but it's close enough to the other change that it probably doesn't matter in the scheme of things. There's also another one in FSD#isUserVisible. I leave it to you to decide if you want to remove it or not. I like the new names for the xattrs. Thanks for your work on this piece of the puzzle. Get the Key/IV from the NameNode for encrypted files in DFSClient - Key: HDFS-6391 URL: https://issues.apache.org/jira/browse/HDFS-6391 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Andrew Wang Attachments: HDFS-6391.1.patch, hdfs-6391.002.patch, hdfs-6391.003.patch When creating/opening an encrypted file, the DFSClient should get the encryption key material and the IV for the file in the create/open RPC call. HDFS admin users would never get the key material/IV when encrypted files are created/opened. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6556) Refine XAttr permissions
[ https://issues.apache.org/jira/browse/HDFS-6556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046055#comment-14046055 ] Uma Maheswara Rao G commented on HDFS-6556: --- The above test failure should be unrelated to this patch. Refine XAttr permissions Key: HDFS-6556 URL: https://issues.apache.org/jira/browse/HDFS-6556 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.0 Reporter: Yi Liu Assignee: Uma Maheswara Rao G Attachments: RefinedPermissions-HDFS-6556-1.patch, RefinedPermissions-HDFS-6556.patch, refinedPermissions-HDFS-6556-2.patch, refinedPermissions-HDFS-6556-3.patch After discussing with Uma, we should refine the permission checks for setting {{user}} and {{trusted}} namespace xattrs. *1.* For {{user}} namespace xattrs: HDFS-6374 says setXAttr should require the user to be the owner of the file or directory, but we had a bit of a misunderstanding. The actual rule is: {quote} The access permissions for user attributes are defined by the file permission bits. Only regular files and directories can have extended attributes. For sticky directories, only the owner and privileged user can write attributes. {quote} We can refer to the Linux source code at http://lxr.free-electrons.com/source/fs/xattr.c?v=2.6.35. I also checked in Linux: it's controlled by the file permission bits for regular files and directories (not sticky). *2.* For the {{trusted}} namespace, we currently require the user to be both the owner and a superuser. Actually, superuser alone is enough. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6604) Disk space leak with shortcircuit
Giuseppe Reina created HDFS-6604: Summary: Disk space leak with shortcircuit Key: HDFS-6604 URL: https://issues.apache.org/jira/browse/HDFS-6604 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Environment: Centos 6.5 and distribution Hortonworks Data Platform v2.1 Reporter: Giuseppe Reina When HDFS shortcircuit is enabled, the file descriptors of deleted HDFS blocks are kept open until the cache is full. This prevents the operating system from freeing the space on disk. More details on the [mailing list thread|http://mail-archives.apache.org/mod_mbox/hbase-user/201406.mbox/%3CCAPjB-CA3RV=slhuhwue5cv3pc4+rffz10-tkydbfs9rt2de...@mail.gmail.com%3E] -- This message was sent by Atlassian JIRA (v6.2#6252)
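[Editor's note: one possible client-side mitigation is to shrink the short-circuit stream cache so descriptors for deleted blocks are closed sooner. The sketch below uses the standard client configuration keys; the chosen values are illustrative assumptions, not tuned recommendations, and whether this fully works around the leak described here is untested.]
{code}
import org.apache.hadoop.conf.Configuration;

public class ShortCircuitCacheTuning {
  public static Configuration tuned() {
    Configuration conf = new Configuration();
    // Fewer cached streams and a shorter expiry mean stale file
    // descriptors are released sooner (values here are assumptions).
    conf.setInt("dfs.client.read.shortcircuit.streams.cache.size", 64);
    conf.setLong("dfs.client.read.shortcircuit.streams.cache.expiry.ms", 60000L);
    return conf;
  }
}
{code}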
[jira] [Comment Edited] (HDFS-6556) Refine XAttr permissions
[ https://issues.apache.org/jira/browse/HDFS-6556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045788#comment-14045788 ] Uma Maheswara Rao G edited comment on HDFS-6556 at 6/27/14 3:38 PM: Thanks a lot, [~cnauroth] for the review! Attached a patch which should address your suggestion. was (Author: umamaheswararao): Thanks a lot, Chris for the review! Attached a patch which should address your suggestion. Refine XAttr permissions Key: HDFS-6556 URL: https://issues.apache.org/jira/browse/HDFS-6556 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.0 Reporter: Yi Liu Assignee: Uma Maheswara Rao G Attachments: RefinedPermissions-HDFS-6556-1.patch, RefinedPermissions-HDFS-6556.patch, refinedPermissions-HDFS-6556-2.patch, refinedPermissions-HDFS-6556-3.patch After discussing with Uma, we should refine the permission checks for setting {{user}} and {{trusted}} namespace xattrs. *1.* For {{user}} namespace xattrs: HDFS-6374 says setXAttr should require the user to be the owner of the file or directory, but we had a bit of a misunderstanding. The actual rule is: {quote} The access permissions for user attributes are defined by the file permission bits. Only regular files and directories can have extended attributes. For sticky directories, only the owner and privileged user can write attributes. {quote} We can refer to the Linux source code at http://lxr.free-electrons.com/source/fs/xattr.c?v=2.6.35. I also checked in Linux: it's controlled by the file permission bits for regular files and directories (not sticky). *2.* For the {{trusted}} namespace, we currently require the user to be both the owner and a superuser. Actually, superuser alone is enough. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6556) Refine XAttr permissions
[ https://issues.apache.org/jira/browse/HDFS-6556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-6556: Hadoop Flags: Reviewed +1 for the patch. Thanks, Uma! Yi, thank you for the code review. Refine XAttr permissions Key: HDFS-6556 URL: https://issues.apache.org/jira/browse/HDFS-6556 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.0 Reporter: Yi Liu Assignee: Uma Maheswara Rao G Attachments: RefinedPermissions-HDFS-6556-1.patch, RefinedPermissions-HDFS-6556.patch, refinedPermissions-HDFS-6556-2.patch, refinedPermissions-HDFS-6556-3.patch After discussing with Uma, we should refine the permission checks for setting {{user}} and {{trusted}} namespace xattrs. *1.* For {{user}} namespace xattrs: HDFS-6374 says setXAttr should require the user to be the owner of the file or directory, but we had a bit of a misunderstanding. The actual rule is: {quote} The access permissions for user attributes are defined by the file permission bits. Only regular files and directories can have extended attributes. For sticky directories, only the owner and privileged user can write attributes. {quote} We can refer to the Linux source code at http://lxr.free-electrons.com/source/fs/xattr.c?v=2.6.35. I also checked in Linux: it's controlled by the file permission bits for regular files and directories (not sticky). *2.* For the {{trusted}} namespace, we currently require the user to be both the owner and a superuser. Actually, superuser alone is enough. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046146#comment-14046146 ] Andrew Purtell commented on HDFS-6591: --
{quote}
bq. Is it actually useful to add the new metric for HedgedReadOpsLoopNum
During debugging, I feel it's very convenient to catch/verify some unnecessary loops; after thinking it over quickly, I didn't find any easier way to catch such a case. How about keeping it there, and when we find it's a hotspot or this feature is stable enough, then removing it? (I can add a comment in the v3 patch, so it will be easy to recall this.)
{quote}
Would it be useful to keep around a @VisibleForTesting counter instead, and a unit test that prints the value of that counter periodically during the run? Because the metric 'HedgedReadOpsLoopNum' has no utility for operations. while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591-v2.txt, HDFS-6591-v3.txt, HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
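[Editor's note: for illustration, a minimal sketch of the test-only counter Andrew suggests, so nothing becomes part of the public metrics contract. The class and field names are hypothetical, not what any patch here actually adds.]
{code}
import java.util.concurrent.atomic.AtomicLong;
import com.google.common.annotations.VisibleForTesting;

class HedgedReadLoopCounterSketch {
  // Visible to unit tests only; never published as a metric.
  @VisibleForTesting
  static final AtomicLong LOOP_ITERATIONS = new AtomicLong();

  void hedgedFetchLoopBody() {
    LOOP_ITERATIONS.incrementAndGet();  // one increment per while-loop pass
    // ...existing hedged-read loop body...
  }
}
{code}
A test can then assert the counter stays small for a given read, catching a regression to tens of thousands of iterations without exposing anything to operators.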
[jira] [Updated] (HDFS-6604) Disk space leak with shortcircuit
[ https://issues.apache.org/jira/browse/HDFS-6604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-6604: - Priority: Critical (was: Major) Disk space leak with shortcircuit - Key: HDFS-6604 URL: https://issues.apache.org/jira/browse/HDFS-6604 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Environment: Centos 6.5 and distribution Hortonworks Data Platform v2.1 Reporter: Giuseppe Reina Priority: Critical When HDFS shortcircuit is enabled, the file descriptors of deleted HDFS blocks are kept open until the cache is full. This prevents the operating system from freeing the space on disk. More details on the [mailing list thread|http://mail-archives.apache.org/mod_mbox/hbase-user/201406.mbox/%3CCAPjB-CA3RV=slhuhwue5cv3pc4+rffz10-tkydbfs9rt2de...@mail.gmail.com%3E] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6527) Edit log corruption due to deferred INode removal
[ https://issues.apache.org/jira/browse/HDFS-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046173#comment-14046173 ] Jing Zhao commented on HDFS-6527: - Thanks [~atm]! The analysis makes sense to me. Let's reopen the jira and fix it in 2.4.1. In the meanwhile, in FSNamesystem#delete, maybe we should move dir.removeFromInodeMap(removedINodes) into the fsnamesystem write lock? I guess this will prevent similar issues in the future. Edit log corruption due to deferred INode removal Key: HDFS-6527 URL: https://issues.apache.org/jira/browse/HDFS-6527 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6527-addendum-test.patch, HDFS-6527.branch-2.4.patch, HDFS-6527.trunk.patch, HDFS-6527.v2.patch, HDFS-6527.v3.patch, HDFS-6527.v4.patch, HDFS-6527.v5.patch We have seen a SBN crashing with the following error: {panel} \[Edit log tailer\] ERROR namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/xxx, penultimateBlock=NULL, lastBlock=blk_111_111, RpcClientId=, RpcCallId=-2] java.io.FileNotFoundException: File does not exist: /xxx {panel} This was caused by the deferred removal of deleted inodes from the inode map. Since getAdditionalBlock() acquires the FSN read lock and then the write lock, a deletion can happen in between. Because of deferred inode removal outside the FSN write lock, getAdditionalBlock() can get the deleted inode from the inode map with the FSN write lock held. This allows the addition of a block to a deleted file. As a result, the edit log will contain OP_ADD, OP_DELETE, followed by OP_ADD_BLOCK. This cannot be replayed by the NN, so the NN doesn't start up or the SBN crashes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6527) Edit log corruption due to deferred INode removal
[ https://issues.apache.org/jira/browse/HDFS-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046194#comment-14046194 ] Aaron T. Myers commented on HDFS-6527: -- bq. In the meanwhile, in FSNamesystem#delete, maybe we should move dir.removeFromInodeMap(removedINodes) into the fsnamesystem write lock? I guess this will prevent similar issues in the future. This currently is done under the write lock, but it's done after having dropped the write lock briefly, so presumably you're proposing to make all of the contents of {{deleteInternal}} happen under a single write lock? That seems reasonable to me, but to be honest I'm not sure of the history of why this deferred inode removal was done in the first place. Edit log corruption due to deferred INode removal Key: HDFS-6527 URL: https://issues.apache.org/jira/browse/HDFS-6527 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6527-addendum-test.patch, HDFS-6527.branch-2.4.patch, HDFS-6527.trunk.patch, HDFS-6527.v2.patch, HDFS-6527.v3.patch, HDFS-6527.v4.patch, HDFS-6527.v5.patch We have seen a SBN crashing with the following error: {panel} \[Edit log tailer\] ERROR namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/xxx, penultimateBlock=NULL, lastBlock=blk_111_111, RpcClientId=, RpcCallId=-2] java.io.FileNotFoundException: File does not exist: /xxx {panel} This was caused by the deferred removal of deleted inodes from the inode map. Since getAdditionalBlock() acquires the FSN read lock and then the write lock, a deletion can happen in between. Because of deferred inode removal outside the FSN write lock, getAdditionalBlock() can get the deleted inode from the inode map with the FSN write lock held. This allows the addition of a block to a deleted file. As a result, the edit log will contain OP_ADD, OP_DELETE, followed by OP_ADD_BLOCK. This cannot be replayed by the NN, so the NN doesn't start up or the SBN crashes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046193#comment-14046193 ] Chris Nauroth commented on HDFS-6591: - I agree with Andrew. The problem with publishing a metric is that technically it becomes part of our contract. If tools start querying the metric (even accidentally), then removing it in a later version is backwards-incompatible. I'm holding off on committing anyway, because I may have found another problem. I wanted to suggest removing the {{future.get()}} from here:
{code}
if (future != null) {
  future.get();
  return;
}
{code}
{{CompletionService#poll}} guarantees that the returned task (if not null) has completed. We don't need the result of the task, so the {{get}} should be unnecessary. However, when I remove that line, I start getting a test failure in {{TestPread#testPreadDFS}}:
{code}
testPreadDFS(org.apache.hadoop.hdfs.TestPread)  Time elapsed: 2.127 sec  <<< FAILURE!
java.lang.AssertionError: Pread Datanode Restart Test byte 0 differs. expected 47 actual 0 expected:<0> but was:<47>
  at org.junit.Assert.fail(Assert.java:88)
  at org.junit.Assert.failNotEquals(Assert.java:743)
  at org.junit.Assert.assertEquals(Assert.java:118)
  at org.junit.Assert.assertEquals(Assert.java:555)
  at org.apache.hadoop.hdfs.TestPread.checkAndEraseData(TestPread.java:92)
  at org.apache.hadoop.hdfs.TestPread.datanodeRestartTest(TestPread.java:225)
  at org.apache.hadoop.hdfs.TestPread.dfsPreadTest(TestPread.java:439)
  at org.apache.hadoop.hdfs.TestPread.testPreadDFS(TestPread.java:250)
{code}
I've isolated the problem to 2 tests running in sequence: {{testMaxOutHedgedReadPool}} followed by {{testPreadDFS}}. Removing {{future.get()}} shouldn't make a difference, so this might indicate a race condition that had been masked by the {{future.get()}} taking up a few extra cycles. Something about the way {{testMaxOutHedgedReadPool}} fills the thread pool seems to set up the timing conditions just right to trigger the test failure. I'm not yet certain what's causing this, but I have a theory. The return value of {{getFromOneDataNode}} gets submitted as a {{Future}}. As you pointed out, that code mutates the buffer that was passed in. If we've already returned to the caller, and then a background task lands late and starts mutating the buffer, then we could see unexpected results. We do cancel the unstarted tasks, but we don't interrupt them if they're already running. Even if we did interrupt them, I don't think we could guarantee interruption before it mutates the buffer. Let me know your thoughts on this. Thanks again for working on this tricky patch! while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591-v2.txt, HDFS-6591-v3.txt, HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
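[Editor's note: the following self-contained program demonstrates the shape of the race Chris theorizes about, not the DFSClient code itself: a straggling hedged task keeps a reference to the shared buffer and writes into it after the winning result has already been consumed.]
{code}
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class LateWriterRace {
  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    CompletionService<Void> cs = new ExecutorCompletionService<>(pool);
    byte[] buf = new byte[1];

    cs.submit(() -> { buf[0] = 47; return null; });                   // fast "datanode"
    cs.submit(() -> { Thread.sleep(50); buf[0] = 0; return null; });  // slow straggler

    cs.take();                   // first completed task; buf[0] is typically 47 here
    Thread.sleep(100);           // caller moves on; the straggler lands late...
    System.out.println(buf[0]);  // ...and the byte has been clobbered back to 0
    pool.shutdown();
  }
}
{code}
Cancelling unstarted tasks does not help here; once a straggler is running it can still mutate the buffer, which matches the "byte 0 differs" failure above.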
[jira] [Updated] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-6591: Hadoop Flags: (was: Reviewed) while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591-v2.txt, HDFS-6591-v3.txt, HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6527) Edit log corruption due to deferred INode removal
[ https://issues.apache.org/jira/browse/HDFS-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046198#comment-14046198 ] Jing Zhao commented on HDFS-6527: - Yeah, let's merge the fix back to 2.4.1 with the unit test changes first. We can revisit the deferred removal in a separate jira. Edit log corruption due to deferred INode removal Key: HDFS-6527 URL: https://issues.apache.org/jira/browse/HDFS-6527 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6527-addendum-test.patch, HDFS-6527.branch-2.4.patch, HDFS-6527.trunk.patch, HDFS-6527.v2.patch, HDFS-6527.v3.patch, HDFS-6527.v4.patch, HDFS-6527.v5.patch We have seen a SBN crashing with the following error: {panel} \[Edit log tailer\] ERROR namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/xxx, penultimateBlock=NULL, lastBlock=blk_111_111, RpcClientId=, RpcCallId=-2] java.io.FileNotFoundException: File does not exist: /xxx {panel} This was caused by the deferred removal of deleted inodes from the inode map. Since getAdditionalBlock() acquires the FSN read lock and then the write lock, a deletion can happen in between. Because of deferred inode removal outside the FSN write lock, getAdditionalBlock() can get the deleted inode from the inode map with the FSN write lock held. This allows the addition of a block to a deleted file. As a result, the edit log will contain OP_ADD, OP_DELETE, followed by OP_ADD_BLOCK. This cannot be replayed by the NN, so the NN doesn't start up or the SBN crashes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046201#comment-14046201 ] Jitendra Nath Pandey commented on HDFS-2856:
- For the specialized encrypted handshake, it seems the encryption key is obtained from the namenode via RPC for every block. That makes it two RPC calls to the namenode for every new block to write. For a given file, the key should be the same and could be obtained only once?
- getEncryptedStreams doesn't use the access token. IMO the user and the password should be derived from the access token rather than the key.
- It might make sense to define the defaults for the new configuration variables in hdfs-default and/or as constants. It helps in code reading at times.
- Log.debug should be wrapped inside an if (Log.isDebugEnabled()) condition.
- checkTrustAndSend obtains a new encryption key irrespective of the QOP needed. I believe the encryption key is needed only for the specialized encryption case.
- The SaslDataTransferClient object in NameNodeConnector.java seems out of place; the NameNodeConnector is supposed to encapsulate only namenode connections. Can we avoid the saslClient in this class?
- RemotePeerFactory.java: Javadoc needs update.
- Minor nit: checkTrustAndSend returns null for skipping the handshake, which has to be checked in the caller. It could just return the same stream pair, which otherwise every caller has to do.
Fix block protocol so that Datanodes don't require root or jsvc --- Key: HDFS-2856 URL: https://issues.apache.org/jira/browse/HDFS-2856 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, security Affects Versions: 3.0.0, 2.4.0 Reporter: Owen O'Malley Assignee: Chris Nauroth Attachments: Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, HDFS-2856-Test-Plan-1.pdf, HDFS-2856.1.patch, HDFS-2856.2.patch, HDFS-2856.3.patch, HDFS-2856.4.patch, HDFS-2856.5.patch, HDFS-2856.prototype.patch Since we send the block tokens unencrypted to the datanode, we currently start the datanode as root using jsvc and get a secure (< 1024) port. If we have the datanode generate a nonce and send it on the connection, and the client sends an hmac of the nonce back instead of the block token, it won't reveal any secrets. Thus, we wouldn't require a secure port and would not require root or jsvc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6601) Issues in finalizing rolling upgrade when there is a layout version change
[ https://issues.apache.org/jira/browse/HDFS-6601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046205#comment-14046205 ] Jing Zhao commented on HDFS-6601: - The patch looks good to me. +1. Thanks [~kihwal]! Issues in finalizing rolling upgrade when there is a layout version change -- Key: HDFS-6601 URL: https://issues.apache.org/jira/browse/HDFS-6601 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Attachments: HDFS-6601.patch After HDFS-6545, we have noticed a couple of issues.
- The storage dir's VERSION file is not properly updated. This becomes a problem when there is a layout version change. We can have the finalization do {{storage.writeAll()}}.
- {{OP_ROLLING_UPGRADE_FINALIZE}} cannot be replayed once the corresponding {{OP_ROLLING_UPGRADE_START}} is consumed and a new fsimage is created (e.g. a rollback image). On restart, the NN terminates, complaining that it can't finalize something it didn't start. We can make the NN ignore {{OP_ROLLING_UPGRADE_FINALIZE}} if no rolling upgrade is in progress.
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5851) Support memory as a storage medium
[ https://issues.apache.org/jira/browse/HDFS-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046247#comment-14046247 ] Henry Saputra commented on HDFS-5851: - Hi [~sanjay.radia], I was looking at the JIRA and proposal, and I have some questions related to it:
1. I did not see where the memory will be allocated for the DDM proposal. Is it similar to HDFS-4949, using memory from the Datanode?
2. As for the APIs, would these be new Hadoop FS (Java) APIs or a higher-level construct to store data in memory? It seems the proposal relies only on the file path to indicate use of the in-memory cache.
3. The problem statement of the proposal suggests there will be a policy to manage how data should be stored in memory per application, but I could not find details about how to achieve it. Some applications may need quick access to a small but more significant portion of data (e.g. newer time-series data), whereas others may need to store more (e.g. a large Hive query).
4. In terms of discardability, what is the eviction policy for such data, and how does one control or fine-tune it if needed? Maybe this was discussed in the in-person meeting that happened before, but I could not find it in the meeting summary.
Thanks for driving this new feature. Support memory as a storage medium -- Key: HDFS-5851 URL: https://issues.apache.org/jira/browse/HDFS-5851 Project: Hadoop HDFS Issue Type: New Feature Components: datanode Affects Versions: 3.0.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf, SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf, SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf Memory can be used as a storage medium for smaller/transient files for fast write throughput. More information/design will be added later. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5851) Support memory as a storage medium
[ https://issues.apache.org/jira/browse/HDFS-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046280#comment-14046280 ] Andrew Purtell commented on HDFS-5851: -- bq. In terms of discardability, what is the eviction policy for such data, and how does one control or fine-tune it if needed? Related: I was talking with [~cmccabe] in the context of HDFS-4949 about possible LRU- or LFU-policy-based eviction, and how that might work. There's an interesting open question of how to revoke access to mapped pages shared by the datanode with another process without causing the client process to segfault. I don't see this issue addressed in the design doc on this issue. One possibility is a callback protocol advising the client process of pending invalidations? Support memory as a storage medium -- Key: HDFS-5851 URL: https://issues.apache.org/jira/browse/HDFS-5851 Project: Hadoop HDFS Issue Type: New Feature Components: datanode Affects Versions: 3.0.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Attachments: SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf, SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf, SupportingMemoryStorageinHDFSPersistentandDiscardableMemory.pdf Memory can be used as a storage medium for smaller/transient files for fast write throughput. More information/design will be added later. -- This message was sent by Atlassian JIRA (v6.2#6252)
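[Editor's note: a sketch of the LRU eviction idea under discussion, using the standard access-ordered LinkedHashMap idiom. The onEvict callback stands in for the open question above, advising a client before its entry is invalidated; none of this is from the design doc.]
{code}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

class LruBlockCache<K, V> extends LinkedHashMap<K, V> {
  private final int capacity;
  private final Consumer<K> onEvict;

  LruBlockCache(int capacity, Consumer<K> onEvict) {
    super(16, 0.75f, true);  // accessOrder=true gives LRU iteration order
    this.capacity = capacity;
    this.onEvict = onEvict;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    if (size() > capacity) {
      onEvict.accept(eldest.getKey());  // notify before the entry drops out
      return true;
    }
    return false;
  }
}
{code}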
[jira] [Commented] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046317#comment-14046317 ] Chris Nauroth commented on HDFS-2856: - Jitendra, thank you for taking a look at this patch. bq. ...it seems the encryption key is obtained from the namenode via RPC for every block... Actually, we cache the encryption key so that we don't need to keep repeating that RPC. (This is true on current trunk and with my patch too.) The key retrieval is now wrapped behind the {{DataEncryptionKeyFactory}} interface. There are 2 implementors of this: the {{DFSClient}} itself and the {{NameNodeConnector}} used by the balancer. In both of those classes, if you look at the {{newDataEncryptionKey}} method, you'll see that they lazily fetch a key and cache it for as long as the key expiry. bq. getEncryptedStreams doesn't use the access token. IMO the user and the password should be derived from the access token rather than the key. Thanks for catching that. This is a private method, so I can easily remove the access token from the signature. We can't change the user/password calculation for the encrypted case now without breaking compatibility. bq. It might make sense to define the defaults for the new configuration variables in hdfs-default and/or as constants. It helps in code reading at times. The patch documents the new properties dfs.data.transfer.protection and dfs.data.transfer.saslproperties.resolver.class in hdfs-default.xml. The default values are set to empty/undefined. I think this is what we want, because it's an opt-in feature. Let me know if you had any other configuration properties in mind. bq. Log.debug should be wrapped inside an if (Log.isDebugEnabled()) condition. The new classes use slf4j. (There was some discussion on mailing lists a few months ago about starting to use this library in new classes.) With slf4j, it's no longer necessary to check {{isDebugEnabled}}. slf4j accepts string substitution variables using varargs, and it checks the log level internally first before doing any string concatenation. Explicitly checking {{isDebugEnabled}} wouldn't provide any performance benefit. bq. checkTrustAndSend obtains a new encryption key irrespective of the QOP needed. I believe the encryption key is needed only for the specialized encryption case. The 2 implementations of {{DataEncryptionKeyFactory}} mentioned above only retrieve an encryption key if encryption is enabled (the NameNode is configured with dfs.encrypt.data.transfer=true). For a deployment configured with SASL on DataTransferProtocol, this will be false, so it won't actually get a key. I'll put a comment in {{SaslDataTransferClient}} to clarify this. bq. The SaslDataTransferClient object in NameNodeConnector.java seems out of place; the NameNodeConnector is supposed to encapsulate only namenode connections. Can we avoid the saslClient in this class? Yeah, what was I thinking there? :-) This is needed by the balancer for its DataNode communication when it needs to move blocks. Let me see if I can move it right into the {{Balancer}} class. bq. RemotePeerFactory.java: Javadoc needs update. Will do. Thanks for the catch. bq. Minor nit: checkTrustAndSend returns null for skipping the handshake, which has to be checked in the caller. It could just return the same stream pair, which otherwise every caller has to do. I actually need to use null as a sentinel value.
In {{peerSend}}, I need to know whether or not a SASL handshake was performed, and if so, wrap the peer in an instance of {{EncryptedPeer}} (which would be better named {{SaslPeer}} at this point, but we can refactor that later). If I returned a non-null {{IOStreamPair}} always, then I wouldn't be able to do this check. I'll get to work on a new revision that incorporates your feedback. Thanks again! Fix block protocol so that Datanodes don't require root or jsvc --- Key: HDFS-2856 URL: https://issues.apache.org/jira/browse/HDFS-2856 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, security Affects Versions: 3.0.0, 2.4.0 Reporter: Owen O'Malley Assignee: Chris Nauroth Attachments: Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, HDFS-2856-Test-Plan-1.pdf, HDFS-2856.1.patch, HDFS-2856.2.patch, HDFS-2856.3.patch, HDFS-2856.4.patch, HDFS-2856.5.patch, HDFS-2856.prototype.patch Since we send the block tokens unencrypted to the datanode, we currently start the datanode as root using jsvc and get a secure (< 1024) port. If we have the datanode generate a nonce and send it on the connection, and the client sends an hmac of the nonce back instead of the block token, it won't reveal any secrets. Thus, we wouldn't require a secure port and would not require root or jsvc. -- This message was sent by Atlassian JIRA (v6.2#6252)
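[Editor's note: a tiny illustration of the slf4j point Chris makes above. The {} placeholders are substituted only when DEBUG is enabled, so no explicit isDebugEnabled() guard is needed; the class and method names here are illustrative only.]
{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class Slf4jStyle {
  private static final Logger LOG = LoggerFactory.getLogger(Slf4jStyle.class);

  void handshakeFinished(String peer, long millis) {
    // commons-logging would need: if (LOG.isDebugEnabled()) { ... }
    // slf4j defers the string formatting until the level check passes.
    LOG.debug("SASL handshake with {} took {} ms", peer, millis);
  }
}
{code}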
[jira] [Commented] (HDFS-6389) Rename restrictions for encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046352#comment-14046352 ] Colin Patrick McCabe commented on HDFS-6389: Thanks. Looks better. I would like to see tests for both rename variants. As it is, we only have tests for the two-argument version {{DistributedFileSystem#rename(Path src, Path dst)}}, since that's all the shell uses. +1 once we have tests for the other rename variant, the one that takes an array of {{Options.Rename}}. Rename restrictions for encryption zones Key: HDFS-6389 URL: https://issues.apache.org/jira/browse/HDFS-6389 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6389.001.patch, HDFS-6389.002.patch, HDFS-6389.tests.patch Files and directories should not be moved in or out of an encryption zone. -- This message was sent by Atlassian JIRA (v6.2#6252)
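[Editor's note: a hedged sketch of what the missing test could look like: exercise the Options.Rename variant across an encryption-zone boundary and expect failure. The zone paths are assumptions, and the exact rename signature/visibility may differ slightly by branch.]
{code}
import org.apache.hadoop.fs.Options;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class RenameVariantSketch {
  static void expectRenameRejected(DistributedFileSystem dfs) throws Exception {
    Path inZone = new Path("/zone/file");     // hypothetical encryption zone
    Path outside = new Path("/plain/file");
    try {
      // The Options.Rename variant Colin asks to cover.
      dfs.rename(inZone, outside, Options.Rename.NONE);
      throw new AssertionError("rename out of an encryption zone should fail");
    } catch (java.io.IOException expected) {
      // expected: files must not move out of an encryption zone
    }
  }
}
{code}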
Re: Anyone know how to mock a secured hdfs for unit test?
Hi David and Kai,

There are a couple of challenges with this, but I just figured out a pretty decent setup while working on HDFS-2856. That code isn't committed yet, but if you open patch version 5 attached to that issue and look for the TestSaslDataTransfer class, you'll see how it works. Most of the logic for bootstrapping a MiniKDC and setting up the right HDFS configuration properties is in an abstract base class named SaslDataTransferTestCase. I hope this helps.

There are a few other open issues out there related to tests in secure mode. I know of HDFS-4312 and HDFS-5410. It would be great to get more regular test coverage with something that more closely approximates a secured deployment.

Chris Nauroth Hortonworks http://hortonworks.com/

On Thu, Jun 26, 2014 at 7:27 AM, Zheng, Kai kai.zh...@intel.com wrote:

Hi David, Quite some time ago I opened HADOOP-9952 and planned to create secured MiniClusters by making use of MiniKDC. Unfortunately I haven't had the chance to work on it since. If you need something like that and would contribute, please let me know and see if there is anything I can help with. Thanks. Regards, Kai

-----Original Message-----
From: Liu, David [mailto:liujion...@gmail.com]
Sent: Thursday, June 26, 2014 10:12 PM
To: hdfs-...@hadoop.apache.org; hdfs-issues@hadoop.apache.org; yarn-...@hadoop.apache.org; yarn-iss...@hadoop.apache.org; mapreduce-...@hadoop.apache.org; secur...@hadoop.apache.org
Subject: Anyone know how to mock a secured hdfs for unit test?

Hi all, I need to test my code, which reads data from secured HDFS. Is there any library to mock secured HDFS? Can MiniDFSCluster do the work? Any suggestion is appreciated. Thanks

-- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
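[Editor's note: a minimal sketch of the MiniKDC bootstrap Chris describes, using the org.apache.hadoop.minikdc.MiniKdc API; the principal names and keytab path are placeholders, not what HDFS-2856's SaslDataTransferTestCase actually uses.]
{code}
import java.io.File;
import java.util.Properties;
import org.apache.hadoop.minikdc.MiniKdc;

public class MiniKdcBootstrap {
  public static void main(String[] args) throws Exception {
    File workDir = new File("target/kdc");
    Properties conf = MiniKdc.createConf();
    MiniKdc kdc = new MiniKdc(conf, workDir);
    kdc.start();

    // Create test principals and a keytab (names are placeholders).
    File keytab = new File(workDir, "test.keytab");
    kdc.createPrincipal(keytab, "hdfs/localhost", "HTTP/localhost");
    System.out.println("realm = " + kdc.getRealm());

    // ...point the keytab/principal config at these, start a
    // Kerberos-enabled MiniDFSCluster, run the secure test...
    kdc.stop();
  }
}
{code}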
[jira] [Updated] (HDFS-6391) Get the Key/IV from the NameNode for encrypted files in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-6391: -- Attachment: hdfs-6391.004.patch The new patch fixes the whitespace issues, and I moved the imports around in FSNamesystem some more too. Thanks for being thorough, Charles; I'll commit this shortly. Get the Key/IV from the NameNode for encrypted files in DFSClient - Key: HDFS-6391 URL: https://issues.apache.org/jira/browse/HDFS-6391 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Andrew Wang Attachments: HDFS-6391.1.patch, hdfs-6391.002.patch, hdfs-6391.003.patch, hdfs-6391.004.patch When creating/opening an encrypted file, the DFSClient should get the encryption key material and the IV for the file in the create/open RPC call. HDFS admin users would never get the key material/IV when creating/opening encrypted files. -- This message was sent by Atlassian JIRA (v6.2#6252)
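As a purely illustrative sketch of the idea in the description above (not the actual HDFS-6391 wire format), the create/open response would carry per-file encryption material alongside the usual file status; the class and field names here are hypothetical.
{code}
// Hypothetical holder for the material returned in the create/open RPC.
class PerFileEncryptionMaterial {
  final byte[] key;  // data encryption key for the file contents
  final byte[] iv;   // initialization vector for the cipher stream
  PerFileEncryptionMaterial(byte[] key, byte[] iv) {
    this.key = key;
    this.iv = iv;
  }
}
{code}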
[jira] [Resolved] (HDFS-6391) Get the Key/IV from the NameNode for encrypted files in DFSClient
[ https://issues.apache.org/jira/browse/HDFS-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang resolved HDFS-6391. --- Resolution: Fixed Fix Version/s: fs-encryption (HADOOP-10150 and HDFS-6134) Committed to branch, put both our names on the commit. Thanks again Charles! Get the Key/IV from the NameNode for encrypted files in DFSClient - Key: HDFS-6391 URL: https://issues.apache.org/jira/browse/HDFS-6391 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Andrew Wang Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) Attachments: HDFS-6391.1.patch, hdfs-6391.002.patch, hdfs-6391.003.patch, hdfs-6391.004.patch When creating/opening an encrypted file, the DFSClient should get the encryption key material and the IV for the file in the create/open RPC call. HDFS admin users would never get the key material/IV when creating/opening encrypted files. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6605) Client server negotiation of cipher suite
Andrew Wang created HDFS-6605: - Summary: Client server negotiation of cipher suite Key: HDFS-6605 URL: https://issues.apache.org/jira/browse/HDFS-6605 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Andrew Wang Assignee: Andrew Wang For compatibility purposes, the client and server should negotiate what cipher suite to use based on their respective capabilities. This is also a way for the server to reject old clients that do not support encryption. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6605) Client server negotiation of cipher suite
[ https://issues.apache.org/jira/browse/HDFS-6605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046386#comment-14046386 ] Andrew Wang commented on HDFS-6605: --- There was some discussion about this on HDFS-6391 with [~michaelbyoder]. I think I'll try the approach outlined there: the client presents the cipher suites it wants to use in priority order, and the server chooses. This should also let us evolve the protocol later if desired. Client server negotiation of cipher suite - Key: HDFS-6605 URL: https://issues.apache.org/jira/browse/HDFS-6605 Project: Hadoop HDFS Issue Type: Sub-task Components: security Affects Versions: fs-encryption (HADOOP-10150 and HDFS-6134) Reporter: Andrew Wang Assignee: Andrew Wang For compatibility purposes, the client and server should negotiate what cipher suite to use based on their respective capabilities. This is also a way for the server to reject old clients that do not support encryption. -- This message was sent by Atlassian JIRA (v6.2#6252)
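A minimal sketch of that negotiation, assuming illustrative enum values and helper names (the real protocol shape was still being worked out at this point):
{code}
import java.util.List;
import java.util.Set;

class CipherSuiteNegotiation {
  enum CipherSuite { AES_CTR_NOPADDING /* , future suites... */ }

  // Client sends its suites in priority order; the server picks the first
  // one it also supports.
  static CipherSuite choose(List<CipherSuite> clientPreference,
                            Set<CipherSuite> serverSupported) {
    for (CipherSuite suite : clientPreference) {
      if (serverSupported.contains(suite)) {
        return suite;
      }
    }
    // No overlap: reject the connection. This is also how the server can
    // turn away old clients that do not support encryption at all.
    throw new IllegalStateException("No common cipher suite");
  }
}
{code}
Because the client's list is ordered, new suites can later be added at higher priority without breaking older servers, which is the compatibility property the description asks for.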
[jira] [Updated] (HDFS-6389) Rename restrictions for encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6389: --- Attachment: HDFS-6389.003.patch [~cmccabe], You read my mind. I was working on the second test when you added that comment. The .003 patch adds the actual call to checkEncryptionZoneMoveValidity as well as the test. Thanks for the review! Rename restrictions for encryption zones Key: HDFS-6389 URL: https://issues.apache.org/jira/browse/HDFS-6389 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6389.001.patch, HDFS-6389.002.patch, HDFS-6389.003.patch, HDFS-6389.tests.patch Files and directories should not be moved in or out of an encryption zone. -- This message was sent by Atlassian JIRA (v6.2#6252)
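For readers without the patch handy, the shape of such a check is roughly the following. The zone-lookup map is a hypothetical stand-in; the real checkEncryptionZoneMoveValidity in the .003 patch resolves zones through the namesystem.
{code}
import java.io.IOException;
import java.util.Map;
import java.util.Objects;

class EncryptionZoneRenameCheckSketch {
  // Hypothetical lookup: path -> root of its encryption zone, or null if
  // the path is not inside any zone.
  private final Map<String, String> zoneRootsByPath;

  EncryptionZoneRenameCheckSketch(Map<String, String> zoneRootsByPath) {
    this.zoneRootsByPath = zoneRootsByPath;
  }

  // A rename is valid only when source and destination resolve to the same
  // encryption zone ("no zone" on both sides also counts as a match).
  void checkEncryptionZoneMoveValidity(String src, String dst) throws IOException {
    String srcZone = zoneRootsByPath.get(src);
    String dstZone = zoneRootsByPath.get(dst);
    if (!Objects.equals(srcZone, dstZone)) {
      throw new IOException("Cannot rename " + src + " to " + dst
          + ": renames across encryption zone boundaries are not allowed");
    }
  }
}
{code}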
[jira] [Resolved] (HDFS-6389) Rename restrictions for encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb resolved HDFS-6389. Resolution: Fixed Fix Version/s: fs-encryption (HADOOP-10150 and HDFS-6134) [~cmccabe], thanks for the review. I've committed this to the fs-encryption branch. Rename restrictions for encryption zones Key: HDFS-6389 URL: https://issues.apache.org/jira/browse/HDFS-6389 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Charles Lamb Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) Attachments: HDFS-6389.001.patch, HDFS-6389.002.patch, HDFS-6389.003.patch, HDFS-6389.tests.patch Files and directories should not be moved in or out of an encryption zone. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6527) Edit log corruption due to deferred INode removal
[ https://issues.apache.org/jira/browse/HDFS-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046441#comment-14046441 ] Jing Zhao commented on HDFS-6527: - [~atm], do you already have an updated patch for 2.4.1? Otherwise I will try it. Edit log corruption due to deferred INode removal Key: HDFS-6527 URL: https://issues.apache.org/jira/browse/HDFS-6527 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6527-addendum-test.patch, HDFS-6527.branch-2.4.patch, HDFS-6527.trunk.patch, HDFS-6527.v2.patch, HDFS-6527.v3.patch, HDFS-6527.v4.patch, HDFS-6527.v5.patch We have seen an SBN crashing with the following error: {panel} \[Edit log tailer\] ERROR namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/xxx, penultimateBlock=NULL, lastBlock=blk_111_111, RpcClientId=, RpcCallId=-2] java.io.FileNotFoundException: File does not exist: /xxx {panel} This was caused by the deferred removal of deleted inodes from the inode map. Since getAdditionalBlock() acquires the FSN read lock and then the write lock, a deletion can happen in between. Because the inode removal is deferred outside the FSN write lock, getAdditionalBlock() can get the deleted inode from the inode map with the FSN write lock held. This allows the addition of a block to a deleted file. As a result, the edit log will contain OP_ADD, OP_DELETE, followed by OP_ADD_BLOCK. This cannot be replayed by the NN, so the NN doesn't start up or the SBN crashes. -- This message was sent by Atlassian JIRA (v6.2#6252)
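A sketch of the race in the description and the shape of the fix: after the write lock is finally taken, the inode looked up from the inode map has to be re-validated, since a delete may have committed in the lock gap while the inode's removal from the map was still deferred. The types and field names here are illustrative stand-ins, not the literal patch.
{code}
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class AddBlockRevalidationSketch {
  static class INode { boolean deleted; }  // stand-in for the real INode

  private final ReentrantReadWriteLock fsnLock = new ReentrantReadWriteLock();
  private final Map<Long, INode> inodeMap = new HashMap<Long, INode>();

  // Models getAdditionalBlock(): the read-locked phase ran earlier, the
  // lock was released, and a concurrent delete may have committed since.
  void addBlock(long fileId, String src) throws FileNotFoundException {
    fsnLock.writeLock().lock();
    try {
      INode inode = inodeMap.get(fileId);
      // Deferred removal means a deleted inode can still be found in the
      // map, so re-validate before logging OP_ADD_BLOCK.
      if (inode == null || inode.deleted) {
        throw new FileNotFoundException("File does not exist: " + src);
      }
      // ...safe to allocate the new block and write the edit...
    } finally {
      fsnLock.writeLock().unlock();
    }
  }
}
{code}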
[jira] [Commented] (HDFS-6441) Add ability to exclude/include few datanodes while balancing
[ https://issues.apache.org/jira/browse/HDFS-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046451#comment-14046451 ] Arpit Agarwal commented on HDFS-6441: - HDFS-6133 excludes by file path, whereas this excludes or includes specific DNs, so they would be orthogonal. [~benoyantony], [~carp84], could you try and reconcile your changes so we can move forward with one of the two Jiras and dup the other? Add ability to exclude/include few datanodes while balancing Key: HDFS-6441 URL: https://issues.apache.org/jira/browse/HDFS-6441 Project: Hadoop HDFS Issue Type: Improvement Components: balancer Affects Versions: 2.4.0 Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch In some use cases, it is desirable to ignore a few data nodes while balancing. The administrator should be able to specify a list of data nodes in a file similar to the hosts file, and the balancer should ignore these data nodes while balancing so that no blocks are added/removed on these nodes. Similarly, it would be beneficial to be able to specify that only a particular list of datanodes should be considered for balancing. -- This message was sent by Atlassian JIRA (v6.2#6252)
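The proposed semantics reduce to a simple membership test per datanode. A sketch with hypothetical names (the patch itself wires this into the Balancer's node selection):
{code}
import java.util.Set;

class BalancerNodeFilterSketch {
  // Excluded nodes never participate; if an include list is present, only
  // listed nodes participate. An empty include list means "no restriction".
  static boolean shouldBalance(String datanodeHost,
                               Set<String> excluded, Set<String> included) {
    if (excluded.contains(datanodeHost)) {
      return false;
    }
    return included.isEmpty() || included.contains(datanodeHost);
  }
}
{code}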
[jira] [Updated] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-2856: Attachment: HDFS-2856.6.patch Here is patch version 6. This incorporates Jitendra's feedback as I described in my last comment. Fix block protocol so that Datanodes don't require root or jsvc --- Key: HDFS-2856 URL: https://issues.apache.org/jira/browse/HDFS-2856 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, security Affects Versions: 3.0.0, 2.4.0 Reporter: Owen O'Malley Assignee: Chris Nauroth Attachments: Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, HDFS-2856-Test-Plan-1.pdf, HDFS-2856.1.patch, HDFS-2856.2.patch, HDFS-2856.3.patch, HDFS-2856.4.patch, HDFS-2856.5.patch, HDFS-2856.6.patch, HDFS-2856.prototype.patch Since we send the block tokens unencrypted to the datanode, we currently start the datanode as root using jsvc and get a secure (< 1024) port. If we have the datanode generate a nonce and send it on the connection, and the client sends an hmac of the nonce back instead of the block token, it won't reveal any secrets. Thus, we wouldn't require a secure port and would not require root or jsvc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-4685) Implementation of ACLs in HDFS
[ https://issues.apache.org/jira/browse/HDFS-4685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-4685: -- Assignee: Chris Nauroth (was: Andrew Wang) Implementation of ACLs in HDFS -- Key: HDFS-4685 URL: https://issues.apache.org/jira/browse/HDFS-4685 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client, namenode, security Affects Versions: 1.1.2 Reporter: Sachin Jose Assignee: Chris Nauroth Fix For: 2.4.0 Attachments: HDFS-4685-branch-2.1.patch, HDFS-4685.1.patch, HDFS-4685.2.patch, HDFS-4685.3.patch, HDFS-4685.4.patch, HDFS-ACLs-Design-1.pdf, HDFS-ACLs-Design-2.pdf, HDFS-ACLs-Design-3.pdf, Test-Plan-for-Extended-Acls-1.pdf, Test-Plan-for-Extended-Acls-2.pdf Currently HDFS doesn't support extended file ACLs. In Unix, extended ACLs can be managed using the getfacl and setfacl utilities. Is there anybody working on this feature? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Work started] (HDFS-6604) Disk space leak with shortcircuit
[ https://issues.apache.org/jira/browse/HDFS-6604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-6604 started by Colin Patrick McCabe. Disk space leak with shortcircuit - Key: HDFS-6604 URL: https://issues.apache.org/jira/browse/HDFS-6604 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Environment: Centos 6.5 and distribution Hortonworks Data Platform v2.1 Reporter: Giuseppe Reina Assignee: Colin Patrick McCabe Priority: Critical Attachments: HDFS-6604.001.patch When HDFS shortcircuit is enabled, the file descriptors of deleted HDFS blocks are kept open until the cache is full. This prevents the operating system from freeing the space on disk. More details in the [mailing list thread|http://mail-archives.apache.org/mod_mbox/hbase-user/201406.mbox/%3CCAPjB-CA3RV=slhuhwue5cv3pc4+rffz10-tkydbfs9rt2de...@mail.gmail.com%3E] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6604) Disk space leak with shortcircuit
[ https://issues.apache.org/jira/browse/HDFS-6604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-6604: --- Attachment: HDFS-6604.001.patch The CacheCleaner thread is supposed to time out replicas that haven't been used for a while (5 minutes by default). It looks like this thread has not been timing out the non-mmapped entries as it should, due to a typo. This patch should fix it. We probably should also have the DN notify the client when a block that the client is reading is unlinked. It could do this via the existing shared memory segment code, in the same way we handle uncaching now. Disk space leak with shortcircuit - Key: HDFS-6604 URL: https://issues.apache.org/jira/browse/HDFS-6604 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Environment: Centos 6.5 and distribution Hortonworks Data Platform v2.1 Reporter: Giuseppe Reina Priority: Critical Attachments: HDFS-6604.001.patch When HDFS shortcircuit is enabled, the file descriptors of deleted HDFS blocks are kept open until the cache is full. This prevents the operating system from freeing the space on disk. More details in the [mailing list thread|http://mail-archives.apache.org/mod_mbox/hbase-user/201406.mbox/%3CCAPjB-CA3RV=slhuhwue5cv3pc4+rffz10-tkydbfs9rt2de...@mail.gmail.com%3E] -- This message was sent by Atlassian JIRA (v6.2#6252)
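To illustrate the class of bug being described (this is not the literal HDFS-6604 diff; all names are stand-ins), an eviction loop that judges entries against the wrong timeout field never expires non-mmapped entries, so their file descriptors stay open and deleted blocks keep holding disk space:
{code}
class CacheExpirySketch {
  static class Replica {
    long lastUsed;        // last access time, in millis
    boolean mmapped;
    boolean isMmapped() { return mmapped; }
  }

  long expiryTimeoutMs = 5 * 60 * 1000;  // 5 minutes by default

  void clean(Iterable<Replica> cachedReplicas) {
    long now = System.currentTimeMillis();
    for (Replica r : cachedReplicas) {
      // Buggy variant: comparing plain entries against the mmap timeout
      // field keeps them alive forever.
      //   if (now - r.lastUsed > mmapTimeoutMs) { evict(r); }
      // Fixed: plain entries must use the regular expiry window.
      if (!r.isMmapped() && now - r.lastUsed > expiryTimeoutMs) {
        evict(r);  // closes the fds, letting the OS reclaim the space
      }
    }
  }

  void evict(Replica r) { /* close file descriptors, drop from cache */ }
}
{code}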
[jira] [Assigned] (HDFS-6604) Disk space leak with shortcircuit
[ https://issues.apache.org/jira/browse/HDFS-6604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe reassigned HDFS-6604: -- Assignee: Colin Patrick McCabe Disk space leak with shortcircuit - Key: HDFS-6604 URL: https://issues.apache.org/jira/browse/HDFS-6604 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Environment: Centos 6.5 and distribution Hortonworks Data Platform v2.1 Reporter: Giuseppe Reina Assignee: Colin Patrick McCabe Priority: Critical Attachments: HDFS-6604.001.patch When HDFS shortcircuit is enabled, the file descriptors of deleted HDFS blocks are kept open until the cache is full. This prevents the operating system from freeing the space on disk. More details in the [mailing list thread|http://mail-archives.apache.org/mod_mbox/hbase-user/201406.mbox/%3CCAPjB-CA3RV=slhuhwue5cv3pc4+rffz10-tkydbfs9rt2de...@mail.gmail.com%3E] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6604) Disk space leak with shortcircuit
[ https://issues.apache.org/jira/browse/HDFS-6604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-6604: - Target Version/s: 2.5.0 Disk space leak with shortcircuit - Key: HDFS-6604 URL: https://issues.apache.org/jira/browse/HDFS-6604 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Environment: Centos 6.5 and distribution Hortonworks Data Platform v2.1 Reporter: Giuseppe Reina Assignee: Colin Patrick McCabe Priority: Critical Attachments: HDFS-6604.001.patch When HDFS shortcircuit is enabled, the file descriptors of deleted HDFS blocks are kept open until the cache is full. This prevents the operating system from freeing the space on disk. More details in the [mailing list thread|http://mail-archives.apache.org/mod_mbox/hbase-user/201406.mbox/%3CCAPjB-CA3RV=slhuhwue5cv3pc4+rffz10-tkydbfs9rt2de...@mail.gmail.com%3E] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6527) Edit log corruption due to deferred INode removal
[ https://issues.apache.org/jira/browse/HDFS-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046521#comment-14046521 ] Aaron T. Myers commented on HDFS-6527: -- Hi Jing, I haven't created that patch yet, no. Please do feel free to go ahead and do that. Thanks a lot. Edit log corruption due to deferred INode removal Key: HDFS-6527 URL: https://issues.apache.org/jira/browse/HDFS-6527 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6527-addendum-test.patch, HDFS-6527.branch-2.4.patch, HDFS-6527.trunk.patch, HDFS-6527.v2.patch, HDFS-6527.v3.patch, HDFS-6527.v4.patch, HDFS-6527.v5.patch We have seen an SBN crashing with the following error: {panel} \[Edit log tailer\] ERROR namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/xxx, penultimateBlock=NULL, lastBlock=blk_111_111, RpcClientId=, RpcCallId=-2] java.io.FileNotFoundException: File does not exist: /xxx {panel} This was caused by the deferred removal of deleted inodes from the inode map. Since getAdditionalBlock() acquires the FSN read lock and then the write lock, a deletion can happen in between. Because the inode removal is deferred outside the FSN write lock, getAdditionalBlock() can get the deleted inode from the inode map with the FSN write lock held. This allows the addition of a block to a deleted file. As a result, the edit log will contain OP_ADD, OP_DELETE, followed by OP_ADD_BLOCK. This cannot be replayed by the NN, so the NN doesn't start up or the SBN crashes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046571#comment-14046571 ] Tsz Wo Nicholas Sze commented on HDFS-6134: --- Correct me if I am wrong: the current design does not prevent a malicious admin who has root access to a node, since one can # dump the memory of a running task to find the plaintext decryption key of a file; # sudo as an hdfs admin and read the encrypted file in raw format; # decrypt the file with the key. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046575#comment-14046575 ] Alejandro Abdelnur commented on HDFS-6134: -- AFAIK, the only way you can protect against a root attack is for all encryption/decryption to happen in sealed hardware (an HSM), with the keys never leaving it. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046573#comment-14046573 ] Hadoop QA commented on HDFS-2856: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12652910/HDFS-2856.6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer org.apache.hadoop.hdfs.server.balancer.TestBalancer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7240//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7240//console This message is automatically generated. Fix block protocol so that Datanodes don't require root or jsvc --- Key: HDFS-2856 URL: https://issues.apache.org/jira/browse/HDFS-2856 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, security Affects Versions: 3.0.0, 2.4.0 Reporter: Owen O'Malley Assignee: Chris Nauroth Attachments: Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, HDFS-2856-Test-Plan-1.pdf, HDFS-2856.1.patch, HDFS-2856.2.patch, HDFS-2856.3.patch, HDFS-2856.4.patch, HDFS-2856.5.patch, HDFS-2856.6.patch, HDFS-2856.prototype.patch Since we send the block tokens unencrypted to the datanode, we currently start the datanode as root using jsvc and get a secure (< 1024) port. If we have the datanode generate a nonce and send it on the connection, and the client sends an hmac of the nonce back instead of the block token, it won't reveal any secrets. Thus, we wouldn't require a secure port and would not require root or jsvc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046577#comment-14046577 ] Andrew Purtell commented on HDFS-6134: -- bq. The current design does not prevent a malicious admin who has root access for a node At rest encryption doesn't address memory protection. Hence, at rest. Someone who has root access can read decrypted plaintext out of memory directly, no need for steps 2 and 3 above. It's meant to provide assurance that, should a disk be improperly disposed of or HDFS permissions be improperly set for a given set of files, no sensitive information can leak in those cases. This is still important. It's commonly viewed as valuable (and required) in various regulatory regimes. bq. the only way you can protect from root attack is for all encryption/decryption to happen in sealed hardware (an HSM) and the keys never leaving such. You also have to somehow get your program logic onto that hardware to perform useful work on the decrypted data, and do it in a way that you can attest to the integrity of the execution environment. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046584#comment-14046584 ] Alejandro Abdelnur commented on HDFS-6134: -- And even then, without ever obtaining the keys, root could harvest from user memory all the info necessary to get the HSM to decrypt anything. Wondering if there is a way to prevent this other than eliminating root access. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6441) Add ability to exclude/include few datanodes while balancing
[ https://issues.apache.org/jira/browse/HDFS-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046694#comment-14046694 ] Benoy Antony commented on HDFS-6441: I am yet to get a response from [~carp84]. Add ability to exclude/include few datanodes while balancing Key: HDFS-6441 URL: https://issues.apache.org/jira/browse/HDFS-6441 Project: Hadoop HDFS Issue Type: Improvement Components: balancer Affects Versions: 2.4.0 Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch, HDFS-6441.patch In some use cases, it is desirable to ignore a few data nodes while balancing. The administrator should be able to specify a list of data nodes in a file similar to the hosts file and the balancer should ignore these data nodes while balancing so that no blocks are added/removed on these nodes. Similarly it will be beneficial to specify that only a particular list of datanodes should be considered for balancing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046711#comment-14046711 ] Todd Lipcon commented on HDFS-6134: --- I think the solution to this issue is administrative policy, e.g. no single person has root access to the machine: two admins each know half of the password, and thus neither can log in without the other watching over their shoulder (or some equivalent thereof using split-key based access, etc). Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)