[jira] [Commented] (HDFS-3093) TestAllowFormat is trying to be interactive
[ https://issues.apache.org/jira/browse/HDFS-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229930#comment-13229930 ] Hudson commented on HDFS-3093: -- Integrated in Hadoop-Common-0.23-Commit #681 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/681/]) HDFS-3093. Fix bug where namenode -format interpreted the -force flag in reverse. Contributed by Todd Lipcon. (Revision 1300813) Result = SUCCESS todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300813 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
TestAllowFormat is trying to be interactive --- Key: HDFS-3093 URL: https://issues.apache.org/jira/browse/HDFS-3093 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.24.0, 0.23.3 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.24.0, 0.23.3 Attachments: hdfs-3039.txt, hdfs-3093.txt, hdfs-3093.txt
HDFS-2731 broke TestAllowFormat such that it now tries to prompt the user, which of course hangs forever.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3094) add -nonInteractive and -force option to namenode -format command
[ https://issues.apache.org/jira/browse/HDFS-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-3094: -- Attachment: HDFS-3094.branch-1.0.patch
Added another test and cleaned up comments.
add -nonInteractive and -force option to namenode -format command - Key: HDFS-3094 URL: https://issues.apache.org/jira/browse/HDFS-3094 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.24.0, 1.0.2 Reporter: Arpit Gupta Assignee: Arpit Gupta Attachments: HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch
Currently bin/hadoop namenode -format prompts the user for a Y/N to set up the directories in the local file system.
-force : the namenode formats the directories without prompting.
-nonInteractive : the namenode format returns with an exit code of 1 if a dir exists.
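The two proposed flags boil down to a small piece of argument handling. The sketch below is purely illustrative (class and field names are hypothetical, not the actual NameNode.parseArguments() code): -force suppresses the prompt entirely, while -nonInteractive turns "directories already exist" into a non-zero exit instead of a question.

```java
// Hypothetical sketch of parsing -force / -nonInteractive for a
// "namenode -format" style command line. Not the actual NameNode code.
public class FormatOptions {
    boolean force = false;        // format without prompting
    boolean interactive = true;   // prompt Y/N when dirs already exist

    static FormatOptions parse(String[] args) {
        FormatOptions opts = new FormatOptions();
        for (String arg : args) {
            if ("-force".equalsIgnoreCase(arg)) {
                opts.force = true;
            } else if ("-nonInteractive".equalsIgnoreCase(arg)) {
                // caller would exit(1) if the dirs exist instead of prompting
                opts.interactive = false;
            }
        }
        return opts;
    }

    public static void main(String[] args) {
        FormatOptions o = parse(new String[] {"-format", "-nonInteractive"});
        System.out.println(o.force + " " + o.interactive); // prints "false false"
    }
}
```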
[jira] [Commented] (HDFS-3093) TestAllowFormat is trying to be interactive
[ https://issues.apache.org/jira/browse/HDFS-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229936#comment-13229936 ] Hudson commented on HDFS-3093: -- Integrated in Hadoop-Mapreduce-0.23-Commit #689 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/689/]) HDFS-3093. Fix bug where namenode -format interpreted the -force flag in reverse. Contributed by Todd Lipcon. (Revision 1300813) Result = ABORTED todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300813 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
[jira] [Commented] (HDFS-3093) TestAllowFormat is trying to be interactive
[ https://issues.apache.org/jira/browse/HDFS-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229935#comment-13229935 ] Hudson commented on HDFS-3093: -- Integrated in Hadoop-Mapreduce-trunk-Commit #1884 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1884/]) HDFS-3093. Fix bug where namenode -format interpreted the -force flag in reverse. Contributed by Todd Lipcon. (Revision 1300814) Result = ABORTED todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300814 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
[jira] [Resolved] (HDFS-1795) Port 0.20-append changes onto 0.20-security-203
[ https://issues.apache.org/jira/browse/HDFS-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved HDFS-1795. --- Resolution: Fixed Fix Version/s: 0.20.205.0
Port 0.20-append changes onto 0.20-security-203 --- Key: HDFS-1795 URL: https://issues.apache.org/jira/browse/HDFS-1795 Project: Hadoop HDFS Issue Type: Task Reporter: Andrew Purtell Fix For: 0.20.205.0 Attachments: security-append-patches.zip
Port 0.20-append changes onto 0.20-security-203. I started with a Git repository cloned from git://git.apache.org/hadoop-common.git. Branch 'branch-0.20-security-203' was used as the starting point for the work. I then enumerated the 0.20-append-specific patches in 'branch-0.20-append'. Each was applied via cherry-pick if not already present, except as noted below. This process in effect replayed the evolution of the 0.20-append branch on top of 0.20-security-203. The functional changes that HBase absolutely relies upon are specially marked. I generally ran the full test suite after each change; there were a couple of exceptions where pairs of adjacent changesets were strongly related, in which case I applied them in sequence and then ran the test suite. During this process I encountered no test failures except for one test in TestFileAppend4, a test brought in from the append branch; I still need to dig in to see whether this is a real problem or the test needs to be changed to work on top of security-203.
{noformat}
commit b9ad012eaf3915c2169a02a7130b54cbcc1d8a89
Author: Dhruba Borthakur dhr...@apache.org
Date: Fri Jun 4 07:20:10 2010 +
    HDFS-200. Support append and sync for hadoop 0.20 branch.
[Required for HBase]

commit c968e11b5a60fc6f28e4e43fbbc8a99e7e49a659
Author: Dhruba Borthakur dhr...@apache.org
Date: Wed Jun 9 23:09:07 2010 +
    HDFS-101. DFSClient correctly detects second datanode failure in write pipeline. (Nicolas Spiegelberg via dhruba)
[Excluded: already in 0.20-security-203 according to a search of the Git change log]

commit 9f7e5ed2ff47444a1dcd12ed34796929d5b9f7d5
Author: Dhruba Borthakur dhr...@apache.org
Date: Wed Jun 9 23:12:21 2010 +
    HDFS-988. Fix bug where saveNamespace can corrupt edits log. (Nicolas Spiegelberg via dhruba)

commit dfbbd6fbadaa95c54a1040b4fe8854b1b858d7a5
Author: Dhruba Borthakur dhr...@apache.org
Date: Thu Jun 10 18:46:03 2010 +
    HDFS-826. Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline. (dhruba)
[Required for HBase]

commit be8d32503d30208a2d7772b3b4b2a270938a4004
Author: Dhruba Borthakur dhr...@apache.org
Date: Thu Jun 10 22:25:39 2010 +
    HDFS-142. Blocks that are being written by a client are stored in the blocksBeingWritten directory. (Dhruba Borthakur, Nicolas Spiegelberg, Todd Lipcon via dhruba)

commit 856efc2e95aaacc597d669c1b053634ff752dbec
Author: Dhruba Borthakur dhr...@apache.org
Date: Fri Jun 11 00:48:41 2010 +
    HDFS-630. Client can exclude specific nodes in the write pipeline. (Nicolas Spiegelberg via dhruba)
[Required for HBase]

commit 2da1a05fc0cc0429229e87694977bae2ba370625
Author: Dhruba Borthakur dhr...@apache.org
Date: Fri Jun 11 01:02:13 2010 +
    HDFS-457. Better handling of volume failure in DataNode Storage. (Nicolas Spiegelberg via dhruba)
[Excluded: already in 0.20-security-203 according to a search of the Git change log]

commit bd42393cd3a3a731ea98b25ddb528ad03a1ab4af
Author: Dhruba Borthakur dhr...@apache.org
Date: Fri Jun 11 23:37:38 2010 +
    HDFS-1054. remove sleep before retry for allocating a block. (Todd Lipcon via dhruba)

commit 120441b9e571a5703ac39b47608e87182f0f4972
Author: Dhruba Borthakur dhr...@apache.org
Date: Wed Jun 16 20:53:12 2010 +
    HDFS-445. pread should refetch block locations when necessary. (Todd Lipcon via dhruba)
[Excluded: already in 0.20-security-203 according to a search of the Git change log]

commit 2004aa453ba6b7ee2045093ba313ef8551a7f8da
Author: Dhruba Borthakur dhr...@apache.org
Date: Wed Jun 16 20:59:10 2010 +
    HDFS-561. Fix write pipeline

commit 2a8227b0e6be8937fc4a654899be2a22c1f6efbe
Author: Dhruba Borthakur dhr...@apache.org
Date: Wed Jun 16 21:13:24 2010 +
    HDFS-927. DFSInputStream retries too many times for new block locations. (Todd Lipcon via dhruba)
[Excluded: already in 0.20-security-203 according to a search of the Git change log]

commit b1e49dbf50a429cf01b636caa2666ff81ed2a016
Author: Dhruba Borthakur dhr...@apache.org
Date: Wed Jun 16 21:21:45 2010 +
    HDFS-1215. Fix unit test TestNodeCount. (Todd Lipcon via dhruba)
{noformat}
[jira] [Commented] (HDFS-3091) Failed to add new DataNode in pipeline and will be resulted into write failure.
[ https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229951#comment-13229951 ] Uma Maheswara Rao G commented on HDFS-3091: --- I have been thinking about this issue. If the *replication* factor is greater than or equal to the total number of live nodes in the cluster, then we need not replace the failed node in the pipeline with a new one. But here we may need an RPC to check the number of live nodes in the cluster. Any alternative solutions? There is also the other case: if the NN is not able to choose any extra node due to the DN xceiver count or other factors, the write may also fail, right? In that case the sanity check itself may not be the correct check. Can't we simply proceed with the normal behaviour when we are not able to get a new node from the NN? Why do we need to strictly insist on one extra node and fail the write? @Nicholas, since you are the author of this feature, I would like your opinion.
Failed to add new DataNode in pipeline and will be resulted into write failure. --- Key: HDFS-3091 URL: https://issues.apache.org/jira/browse/HDFS-3091 Project: Hadoop HDFS Issue Type: Bug Components: data-node, hdfs client, name-node Affects Versions: 0.23.0, 0.24.0 Reporter: Uma Maheswara Rao G
While verifying the HDFS-1606 feature, I observed a couple of issues. Presently the ReplaceDatanodeOnFailure policy can be satisfied even though the cluster does not have enough DNs to replace the failed one, which results in a write failure.
{quote}
12/03/13 14:27:12 WARN hdfs.DFSClient: DataStreamer Exception
java.io.IOException: Failed to add a datanode: nodes.length != original.length + 1, nodes=[10.18.52.55:50010], original=[10.18.52.55:50010]
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:778)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:834)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:930)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:741)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:416)
{quote}
Let's take some cases:
1) The replication factor is 3, the cluster size is also 3, and unfortunately the pipeline drops to 1. ReplaceDatanodeOnFailure will be satisfied because *existings(1) <= replication/2 (3/2 == 1)*. But when it looks for a new node to replace the failed one, it obviously cannot find one, and the sanity check fails. This results in a write failure.
2) The replication factor is 10 (the user accidentally sets the replication factor higher than the cluster size), and the cluster has only 5 datanodes. Here the write fails even if only one node dies, for the same reason: the pipeline is at most 5 nodes, and after one datanode is killed existings will be 4, so *existings(4) <= replication/2 (10/2 == 5)* is satisfied, and obviously no replacement can be found since no extra nodes exist in the cluster. This results in a write failure.
3) Sync-related operations also fail in these situations (will post the exact scenarios).
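The two failing cases above reduce to one inequality. The sketch below is illustrative only (it is not the actual ReplaceDatanodeOnFailure code, whose full policy also considers append/hflush state): replacement is attempted when the surviving pipeline has dropped to half the replication factor or less, regardless of whether a spare node exists.

```java
// Illustrative reduction of the replacement condition described in the
// cases above; not the actual ReplaceDatanodeOnFailure implementation.
public class ReplacePolicySketch {
    // Replace a failed datanode when the surviving pipeline is at or
    // below half the replication factor (integer division, as quoted).
    static boolean shouldReplace(int replication, int existings) {
        return existings <= replication / 2;
    }

    public static void main(String[] args) {
        // Case 1: replication 3, pipeline down to 1 node -> replacement
        // required, but a 3-node cluster has no spare node to offer.
        System.out.println(shouldReplace(3, 1));   // true
        // Case 2: replication 10 on a 5-node cluster, one node lost ->
        // replacement required, again with no spare node available.
        System.out.println(shouldReplace(10, 4));  // true
    }
}
```

In both cases the policy demands a replacement that the cluster cannot supply, so the client-side sanity check (nodes.length != original.length + 1) throws and the write fails.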
[jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
[ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229980#comment-13229980 ] Hadoop QA commented on HDFS-3067: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517987/HDFS-3067.1.patch against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.TestDatanodeBlockScanner
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2008//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2008//console This message is automatically generated.
NPE in DFSInputStream.readBuffer if read is repeated on corrupted block --- Key: HDFS-3067 URL: https://issues.apache.org/jira/browse/HDFS-3067 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0 Reporter: Henry Robinson Assignee: Henry Robinson Attachments: HDFS-3067.1.patch, HDFS-3607.patch
With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException.
Here's the body of a test that reproduces the problem:
{code}
final short REPL_FACTOR = 1;
final long FILE_LENGTH = 512L;
cluster.waitActive();
FileSystem fs = cluster.getFileSystem();
Path path = new Path("/corrupted");
DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L);
DFSTestUtil.waitReplication(fs, path, REPL_FACTOR);
ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path);
int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block);
assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted);
InetSocketAddress nnAddr = new InetSocketAddress("localhost", cluster.getNameNodePort());
DFSClient client = new DFSClient(nnAddr, conf);
DFSInputStream dis = client.open(path.toString());
byte[] arr = new byte[(int) FILE_LENGTH];
boolean sawException = false;
try {
  dis.read(arr, 0, (int) FILE_LENGTH);
} catch (ChecksumException ex) {
  sawException = true;
}
assertTrue(sawException);
sawException = false;
try {
  dis.read(arr, 0, (int) FILE_LENGTH); // <-- NPE thrown here
} catch (ChecksumException ex) {
  sawException = true;
}
{code}
The stack:
{code}
java.lang.NullPointerException
at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545)
[snip test stack]
{code}
and the problem is that currentNode is null. It's left at null after the first read, which fails, and is then never refreshed, because the condition in read that protects blockSeekTo is only triggered when the current position is outside the block's range.
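The failure mode described above can be reduced to a tiny sketch (field and method names are illustrative, not the actual DFSInputStream source): the node reference is cleared by the failed read, but the guard around node re-selection only fires when the position leaves the block, so the second read dereferences null. Adding a null check to the guard is one plausible shape of a fix.

```java
// Hypothetical reduction of the bug: a reference cleared on error is
// only lazily refreshed when the position crosses the block boundary.
public class StreamSketch {
    Object currentNode = null;       // already cleared by the failed first read
    long pos = 0, blockEnd = 511;    // second read starts inside the same block
    final boolean nullCheckFix;      // true = also re-select when the node is null

    StreamSketch(boolean fix) { nullCheckFix = fix; }

    int read() {
        // Original guard: re-select a datanode only past the block end,
        // so a cleared currentNode is never refreshed mid-block.
        if (pos > blockEnd || (nullCheckFix && currentNode == null)) {
            currentNode = new Object();  // stands in for blockSeekTo()
        }
        return currentNode.hashCode();   // NPE here when currentNode is null
    }

    public static void main(String[] args) {
        try {
            new StreamSketch(false).read();
        } catch (NullPointerException e) {
            System.out.println("NPE on the repeated read without the null guard");
        }
        new StreamSketch(true).read();   // with the guard, no exception
    }
}
```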
[jira] [Commented] (HDFS-3057) httpfs and hdfs launcher scripts should honor CATALINA_HOME and HADOOP_LIBEXEC_DIR
[ https://issues.apache.org/jira/browse/HDFS-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230121#comment-13230121 ] Hudson commented on HDFS-3057: -- Integrated in Hadoop-Hdfs-trunk #985 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/985/]) HDFS-3057. httpfs and hdfs launcher scripts should honor CATALINA_HOME and HADOOP_LIBEXEC_DIR (rvs via tucu) (Revision 1300637) Result = UNSTABLE tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300637 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/libexec/httpfs-config.sh * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/sbin/httpfs.sh * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
httpfs and hdfs launcher scripts should honor CATALINA_HOME and HADOOP_LIBEXEC_DIR -- Key: HDFS-3057 URL: https://issues.apache.org/jira/browse/HDFS-3057 Project: Hadoop HDFS Issue Type: Bug Components: scripts Affects Versions: 0.23.1 Reporter: Roman Shaposhnik Assignee: Roman Shaposhnik Fix For: 0.23.3 Attachments: HDFS-3057.patch.txt
In sbin/httpfs.sh the following should use CATALINA_HOME:
{noformat}
if [ "${HTTPFS_SILENT}" != "true" ]; then
  ${CATALINA_BASE:-${BASEDIR}/share/hadoop/httpfs/tomcat}/bin/catalina.sh "$@"
else
  ${CATALINA_BASE:-${BASEDIR}/share/hadoop/httpfs/tomcat}/bin/catalina.sh "$@" > /dev/null
fi
{noformat}
and the following should honor HADOOP_LIBEXEC_DIR:
{noformat}
source ${BASEDIR}/libexec/httpfs-config.sh
{noformat}
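The requested behaviour amounts to preferring caller-supplied environment variables over the bundled defaults. The sketch below is illustrative only, not the committed patch; the BASEDIR default path is a hypothetical example, and the source/launch lines are guarded so the sketch runs standalone.

```shell
#!/usr/bin/env bash
# Illustrative sketch of honoring CATALINA_HOME and HADOOP_LIBEXEC_DIR
# in an httpfs-style launcher (not the actual committed httpfs.sh).
BASEDIR="${BASEDIR:-/usr/lib/hadoop-httpfs}"   # hypothetical install root

# Honor HADOOP_LIBEXEC_DIR if the caller exported it; else use the bundled libexec.
HADOOP_LIBEXEC_DIR="${HADOOP_LIBEXEC_DIR:-${BASEDIR}/libexec}"
# Guarded so the sketch is runnable even where the config script is absent.
[ -r "${HADOOP_LIBEXEC_DIR}/httpfs-config.sh" ] && . "${HADOOP_LIBEXEC_DIR}/httpfs-config.sh"

# Honor CATALINA_HOME; fall back to the bundled Tomcat only when unset.
CATALINA_HOME="${CATALINA_HOME:-${BASEDIR}/share/hadoop/httpfs/tomcat}"
echo "would exec: ${CATALINA_HOME}/bin/catalina.sh"
```

The key idiom is `${VAR:-default}`: the expansion yields the caller's value when VAR is set and non-empty, and the bundled path otherwise, so no extra if/else is needed.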
[jira] [Commented] (HDFS-3093) TestAllowFormat is trying to be interactive
[ https://issues.apache.org/jira/browse/HDFS-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230123#comment-13230123 ] Hudson commented on HDFS-3093: -- Integrated in Hadoop-Hdfs-trunk #985 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/985/]) HDFS-3093. Fix bug where namenode -format interpreted the -force flag in reverse. Contributed by Todd Lipcon. (Revision 1300814) Result = UNSTABLE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300814 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
[jira] [Commented] (HDFS-3057) httpfs and hdfs launcher scripts should honor CATALINA_HOME and HADOOP_LIBEXEC_DIR
[ https://issues.apache.org/jira/browse/HDFS-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230130#comment-13230130 ] Hudson commented on HDFS-3057: -- Integrated in Hadoop-Hdfs-0.23-Build #198 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/198/]) Merge -r 1300636:1300637 from trunk to branch. FIXES: HDFS-3057 (Revision 1300641) Result = UNSTABLE tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300641 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/libexec/httpfs-config.sh * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/sbin/httpfs.sh * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-3093) TestAllowFormat is trying to be interactive
[ https://issues.apache.org/jira/browse/HDFS-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230132#comment-13230132 ] Hudson commented on HDFS-3093: -- Integrated in Hadoop-Hdfs-0.23-Build #198 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/198/]) HDFS-3093. Fix bug where namenode -format interpreted the -force flag in reverse. Contributed by Todd Lipcon. (Revision 1300813) Result = UNSTABLE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300813 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
[jira] [Commented] (HDFS-3057) httpfs and hdfs launcher scripts should honor CATALINA_HOME and HADOOP_LIBEXEC_DIR
[ https://issues.apache.org/jira/browse/HDFS-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230152#comment-13230152 ] Hudson commented on HDFS-3057: -- Integrated in Hadoop-Mapreduce-0.23-Build #226 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/226/]) Merge -r 1300636:1300637 from trunk to branch. FIXES: HDFS-3057 (Revision 1300641) Result = FAILURE tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300641 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/libexec/httpfs-config.sh * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/sbin/httpfs.sh * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-3093) TestAllowFormat is trying to be interactive
[ https://issues.apache.org/jira/browse/HDFS-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230154#comment-13230154 ] Hudson commented on HDFS-3093: -- Integrated in Hadoop-Mapreduce-0.23-Build #226 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/226/]) HDFS-3093. Fix bug where namenode -format interpreted the -force flag in reverse. Contributed by Todd Lipcon. (Revision 1300813) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300813 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
[jira] [Commented] (HDFS-3057) httpfs and hdfs launcher scripts should honor CATALINA_HOME and HADOOP_LIBEXEC_DIR
[ https://issues.apache.org/jira/browse/HDFS-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230177#comment-13230177 ] Hudson commented on HDFS-3057: -- Integrated in Hadoop-Mapreduce-trunk #1020 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1020/]) HDFS-3057. httpfs and hdfs launcher scripts should honor CATALINA_HOME and HADOOP_LIBEXEC_DIR (rvs via tucu) (Revision 1300637) Result = FAILURE tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300637 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/libexec/httpfs-config.sh * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/sbin/httpfs.sh * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-3093) TestAllowFormat is trying to be interactive
[ https://issues.apache.org/jira/browse/HDFS-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230179#comment-13230179 ] Hudson commented on HDFS-3093: -- Integrated in Hadoop-Mapreduce-trunk #1020 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1020/]) HDFS-3093. Fix bug where namenode -format interpreted the -force flag in reverse. Contributed by Todd Lipcon. (Revision 1300814) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300814 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
[jira] [Updated] (HDFS-3094) add -nonInteractive and -force option to namenode -format command
[ https://issues.apache.org/jira/browse/HDFS-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-3094: -- Attachment: HDFS-3094.branch-1.0.patch
Updated the tests to check for the non-existence of the version file when the format command did not succeed.
[jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
[ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230304#comment-13230304 ] Henry Robinson commented on HDFS-3067: -- There are two test failures: * TestHDFSCli looks like an inherited failure - it's been failing in other pre-commit builds. * TestDatanodeBlockScanner passes every time for me locally. The test result makes it look like the standard corrupt-a-block mechanism failed by hitting a timeout. Could this be environmental? NPE in DFSInputStream.readBuffer if read is repeated on corrupted block --- Key: HDFS-3067 URL: https://issues.apache.org/jira/browse/HDFS-3067 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0 Reporter: Henry Robinson Assignee: Henry Robinson Attachments: HDFS-3067.1.patch, HDFS-3607.patch With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException. 
Here's the body of a test that reproduces the problem: {code} final short REPL_FACTOR = 1; final long FILE_LENGTH = 512L; cluster.waitActive(); FileSystem fs = cluster.getFileSystem(); Path path = new Path("/corrupted"); DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L); DFSTestUtil.waitReplication(fs, path, REPL_FACTOR); ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path); int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block); assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted); InetSocketAddress nnAddr = new InetSocketAddress("localhost", cluster.getNameNodePort()); DFSClient client = new DFSClient(nnAddr, conf); DFSInputStream dis = client.open(path.toString()); byte[] arr = new byte[(int)FILE_LENGTH]; boolean sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); } catch (ChecksumException ex) { sawException = true; } assertTrue(sawException); sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); // <-- NPE thrown here } catch (ChecksumException ex) { sawException = true; } {code} The stack: {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545) [snip test stack] {code} and the problem is that currentNode is null. It is left at null after the first read, which fails, and is then never refreshed, because the condition in read that protects blockSeekTo is only triggered if the current position is outside the block's range.
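The broken guard described above can be modeled with a small standalone sketch (the method and variable names here are hypothetical illustrations, not the actual DFSInputStream code): after the first read fails with a ChecksumException, currentNode is null but the position is still inside the block, so a range-only check never re-runs blockSeekTo and the next read dereferences null.

```java
// Minimal model of the seek guard discussed above (hypothetical names).
public class ReadGuardSketch {
    // Buggy condition: only re-seek when the position leaves the block range.
    static boolean buggyNeedsSeek(long pos, long blockEnd, Object currentNode) {
        return pos > blockEnd; // misses the currentNode == null case
    }
    // Fixed condition: also re-seek when the previous read left no datanode.
    static boolean fixedNeedsSeek(long pos, long blockEnd, Object currentNode) {
        return pos > blockEnd || currentNode == null;
    }
    public static void main(String[] args) {
        long pos = 0, blockEnd = 511;   // position still inside the corrupted block
        Object currentNode = null;      // left null by the failed first read
        System.out.println(buggyNeedsSeek(pos, blockEnd, currentNode)); // false -> NPE path
        System.out.println(fixedNeedsSeek(pos, blockEnd, currentNode)); // true  -> re-seek
    }
}
```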
[jira] [Created] (HDFS-3097) Use 'exec' to invoke catalina.sh in HttpFS's httpfs.sh
Use 'exec' to invoke catalina.sh in HttpFS's httpfs.sh -- Key: HDFS-3097 URL: https://issues.apache.org/jira/browse/HDFS-3097 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.23.1 Reporter: Herman Chen Without 'exec', the shell spawns a new child process, which defeats the purpose when you would like to run the server in the foreground with 'httpfs.sh run', which eventually invokes 'catalina.sh run'.
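The difference can be demonstrated with a tiny shell experiment (a sketch independent of httpfs.sh itself): 'exec' replaces the current shell process instead of forking a child, so the invoked command keeps the same PID and receives foreground signals directly.

```shell
# Without exec: the outer shell forks a child, so two distinct PIDs appear.
without_exec=$(sh -c 'echo $$; sh -c "echo \$\$"' | sort -u | wc -l)
# With exec: the outer shell is replaced in place, so the PID is unchanged.
with_exec=$(sh -c 'echo $$; exec sh -c "echo \$\$"' | sort -u | wc -l)
echo "distinct PIDs without exec: $without_exec"  # 2
echo "distinct PIDs with exec: $with_exec"        # 1
```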
[jira] [Created] (HDFS-3098) Update FsShell tests for quoted metachars
Update FsShell tests for quoted metachars - Key: HDFS-3098 URL: https://issues.apache.org/jira/browse/HDFS-3098 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.24.0, 0.23.2 Reporter: Daryn Sharp Assignee: Daryn Sharp Need to add tests to TestDFSShell for quoted metachars. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-3095) Namenode format should not create the storage directory if it doesn't exist
[ https://issues.apache.org/jira/browse/HDFS-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-3095. -- Resolution: Invalid Release Note: Todd made a point here. HDFS user should not have write permission beyond mount points. Namenode format should not create the storage directory if it doesn't exist Key: HDFS-3095 URL: https://issues.apache.org/jira/browse/HDFS-3095 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.24.0, 1.1.0 Reporter: Brandon Li Assignee: Brandon Li The storage directory can be a mount point. Automatically creating the mount point could be problematic. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3087) Decomissioning on NN restart can complete without blocks being replicated
[ https://issues.apache.org/jira/browse/HDFS-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230357#comment-13230357 ] Suresh Srinivas commented on HDFS-3087: --- Kihwal, this is a good bug find. We should fix this, though the problem is not that serious. Prior to 0.23, we shut down the datanode once decommission completed. After HDFS-1547 we no longer shut down the DN; it continues to show as decommissioned. The expectation is that an admin can shut down the decommissioned DNs at a later time and proceed with maintenance of the node. Given this, the question is: after we mark a DN as decommissioned, what happens when its block report comes in? I suspect we move back to decommission-in-progress. How about using the flag that DatanodeDescriptor has for tracking the first block report? We should not mark a DN as decommissioned if its block report has not been received. I also agree that we should not mark anything as decommissioned until we come out of safemode. Decomissioning on NN restart can complete without blocks being replicated - Key: HDFS-3087 URL: https://issues.apache.org/jira/browse/HDFS-3087 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0, 0.24.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 0.23.0, 0.24.0, 0.23.2, 0.23.3 If a data node is added to the exclude list and the name node is restarted, decommissioning happens right away on data node registration. At this point the initial block report has not been sent, so the name node thinks the node has zero blocks and decommissioning completes very quickly, without replicating the blocks on that node.
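The gating Suresh suggests can be sketched as a simple predicate (hypothetical names, not the actual DatanodeDescriptor API): completion of decommissioning should require that the node's first block report has arrived, since before that the NN believes the node holds zero blocks and decommission is trivially, and wrongly, "complete".

```java
// Hedged sketch of the proposed decommission-completion check (hypothetical names).
public class DecomCheckSketch {
    static boolean canMarkDecommissioned(boolean firstBlockReportReceived,
                                         boolean inSafeMode,
                                         int pendingReplicationBlocks) {
        // Never complete decommission before the first block report, or while
        // still in safemode, or while blocks still need re-replication.
        return firstBlockReportReceived && !inSafeMode && pendingReplicationBlocks == 0;
    }
    public static void main(String[] args) {
        // NN just restarted, no block report yet: must NOT complete
        System.out.println(canMarkDecommissioned(false, false, 0)); // false
        // Report received, replication finished, out of safemode: OK
        System.out.println(canMarkDecommissioned(true, false, 0));  // true
    }
}
```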
[jira] [Created] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3099: - Status: Patch Available (was: Open) SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3099.patch The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3099: - Attachment: HDFS-3099.patch Here's a trivial patch which fixes the issue. I tested this manually by starting up a 2NN and browsing to /jmx. I confirmed that the expected metrics appear with this patch, whereas they do not without it. SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3099.patch The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output.
[jira] [Updated] (HDFS-1601) Pipeline ACKs are sent as lots of tiny TCP packets
[ https://issues.apache.org/jira/browse/HDFS-1601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated HDFS-1601: -- Fix Version/s: 0.22.1 Pipeline ACKs are sent as lots of tiny TCP packets -- Key: HDFS-1601 URL: https://issues.apache.org/jira/browse/HDFS-1601 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0, 0.22.1 Attachments: hdfs-1601-22.txt, hdfs-1601.txt, hdfs-1601.txt I noticed in an hbase benchmark that the packet counts in my network monitoring seemed high, so took a short pcap trace and found that each pipeline ACK was being sent as five packets, the first four of which only contain one byte. We should buffer these bytes and send the PipelineAck as one TCP packet. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
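The buffering fix can be illustrated in isolation (a sketch of the general technique, not the actual PipelineAck code): four one-byte "header" writes followed by a payload reach the socket as five separate writes, each of which the kernel may send as its own TCP packet, while a BufferedOutputStream flushed once per ack coalesces them into a single write.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

public class AckBufferSketch {
    // Counts how many write() calls reach the underlying "socket" stream.
    static class CountingStream extends ByteArrayOutputStream {
        int writes = 0;
        @Override public synchronized void write(byte[] b, int off, int len) {
            writes++; super.write(b, off, len);
        }
        @Override public synchronized void write(int b) {
            writes++; super.write(b);
        }
    }

    // Unbuffered: each tiny write becomes its own packet on the wire.
    static int unbufferedWrites() {
        CountingStream raw = new CountingStream();
        for (int i = 0; i < 4; i++) raw.write(i);  // four 1-byte "header" writes
        raw.write(new byte[]{9, 9, 9}, 0, 3);      // payload
        return raw.writes;
    }

    // Buffered: one flush per ack sends everything as a single write.
    static int bufferedWrites() {
        CountingStream raw = new CountingStream();
        BufferedOutputStream out = new BufferedOutputStream(raw, 8192);
        try {
            for (int i = 0; i < 4; i++) out.write(i);
            out.write(new byte[]{9, 9, 9}, 0, 3);
            out.flush();                           // single coalesced write
        } catch (IOException e) {                  // cannot happen on a byte array
            throw new RuntimeException(e);
        }
        return raw.writes;
    }

    public static void main(String[] args) {
        System.out.println(unbufferedWrites()); // 5
        System.out.println(bufferedWrites());   // 1
    }
}
```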
[jira] [Updated] (HDFS-3005) ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..)
[ https://issues.apache.org/jira/browse/HDFS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3005: - Resolution: Fixed Fix Version/s: 0.23.3 0.24.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I have committed this to trunk and 0.23. ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..) Key: HDFS-3005 URL: https://issues.apache.org/jira/browse/HDFS-3005 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.24.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 0.23.3 Attachments: HDFS-3005.patch, h3005_20120312.patch, h3005_20120314.patch, h3005_20120314b.patch Saw this in [build #1888|https://builds.apache.org/job/PreCommit-HDFS-Build/1888//testReport/org.apache.hadoop.hdfs.server.datanode/TestMulitipleNNDataBlockScanner/testBlockScannerAfterRestart/]. {noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.getDfsUsed(FSDataset.java:557) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:809) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.access$1400(FSDataset.java:774) at org.apache.hadoop.hdfs.server.datanode.FSDataset.getDfsUsed(FSDataset.java:1124) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.sendHeartBeat(BPOfferService.java:406) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.offerService(BPOfferService.java:490) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.run(BPOfferService.java:635) at java.lang.Thread.run(Thread.java:662) {noformat} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
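The failure mode here is HashMap's standard fail-fast behavior and can be reproduced without HDFS at all. In FSDataset the conflicting structural modification comes from another thread (decDfsUsed(..) running while getDfsUsed(..) iterates), which is why the committed fix synchronizes it; this minimal single-threaded demo trips the same exception:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// A structural change to a HashMap while iterating over it trips the
// fail-fast iterator with ConcurrentModificationException.
public class CmeSketch {
    static boolean iterateWhileModifying(Map<String, Long> perVolumeUsed) {
        try {
            long total = 0;
            for (Map.Entry<String, Long> e : perVolumeUsed.entrySet()) {
                total += e.getValue();
                perVolumeUsed.remove("vol2"); // structural change mid-iteration
            }
            return false; // reached only if nothing was actually removed
        } catch (ConcurrentModificationException cme) {
            return true;  // the fail-fast iterator detected the modification
        }
    }
    static boolean demo() {
        Map<String, Long> used = new HashMap<>();
        used.put("vol1", 100L); used.put("vol2", 200L); used.put("vol3", 300L);
        return iterateWhileModifying(used);
    }
    public static void main(String[] args) {
        System.out.println(demo()); // true
    }
}
```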
[jira] [Commented] (HDFS-3005) ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..)
[ https://issues.apache.org/jira/browse/HDFS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230404#comment-13230404 ] Hudson commented on HDFS-3005: -- Integrated in Hadoop-Common-trunk-Commit #1878 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1878/]) HDFS-3005. FSVolume.decDfsUsed(..) should be synchronized. (Revision 1301127) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301127 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..) Key: HDFS-3005 URL: https://issues.apache.org/jira/browse/HDFS-3005 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.24.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 0.23.3 Attachments: HDFS-3005.patch, h3005_20120312.patch, h3005_20120314.patch, h3005_20120314b.patch Saw this in [build #1888|https://builds.apache.org/job/PreCommit-HDFS-Build/1888//testReport/org.apache.hadoop.hdfs.server.datanode/TestMulitipleNNDataBlockScanner/testBlockScannerAfterRestart/]. 
{noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.getDfsUsed(FSDataset.java:557) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:809) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.access$1400(FSDataset.java:774) at org.apache.hadoop.hdfs.server.datanode.FSDataset.getDfsUsed(FSDataset.java:1124) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.sendHeartBeat(BPOfferService.java:406) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.offerService(BPOfferService.java:490) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.run(BPOfferService.java:635) at java.lang.Thread.run(Thread.java:662) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3005) ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..)
[ https://issues.apache.org/jira/browse/HDFS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230409#comment-13230409 ] Hudson commented on HDFS-3005: -- Integrated in Hadoop-Common-0.23-Commit #684 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/684/]) svn merge -c 1301127 from trunk for HDFS-3005. (Revision 1301130) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301130 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..) Key: HDFS-3005 URL: https://issues.apache.org/jira/browse/HDFS-3005 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.24.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 0.23.3 Attachments: HDFS-3005.patch, h3005_20120312.patch, h3005_20120314.patch, h3005_20120314b.patch Saw this in [build #1888|https://builds.apache.org/job/PreCommit-HDFS-Build/1888//testReport/org.apache.hadoop.hdfs.server.datanode/TestMulitipleNNDataBlockScanner/testBlockScannerAfterRestart/]. 
{noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.getDfsUsed(FSDataset.java:557) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:809) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.access$1400(FSDataset.java:774) at org.apache.hadoop.hdfs.server.datanode.FSDataset.getDfsUsed(FSDataset.java:1124) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.sendHeartBeat(BPOfferService.java:406) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.offerService(BPOfferService.java:490) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.run(BPOfferService.java:635) at java.lang.Thread.run(Thread.java:662) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3005) ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..)
[ https://issues.apache.org/jira/browse/HDFS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230411#comment-13230411 ] Hudson commented on HDFS-3005: -- Integrated in Hadoop-Hdfs-0.23-Commit #675 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/675/]) svn merge -c 1301127 from trunk for HDFS-3005. (Revision 1301130) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301130 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..) Key: HDFS-3005 URL: https://issues.apache.org/jira/browse/HDFS-3005 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.24.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 0.23.3 Attachments: HDFS-3005.patch, h3005_20120312.patch, h3005_20120314.patch, h3005_20120314b.patch Saw this in [build #1888|https://builds.apache.org/job/PreCommit-HDFS-Build/1888//testReport/org.apache.hadoop.hdfs.server.datanode/TestMulitipleNNDataBlockScanner/testBlockScannerAfterRestart/]. 
{noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.getDfsUsed(FSDataset.java:557) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:809) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.access$1400(FSDataset.java:774) at org.apache.hadoop.hdfs.server.datanode.FSDataset.getDfsUsed(FSDataset.java:1124) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.sendHeartBeat(BPOfferService.java:406) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.offerService(BPOfferService.java:490) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.run(BPOfferService.java:635) at java.lang.Thread.run(Thread.java:662) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3004) Implement Recovery Mode
[ https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3004: --- Attachment: HDFS-3004.012.patch * make more exceptions skippable * rename StartupOption.ALWAYS_CHOOSE_YES to StartupOption.ALWAYS_CHOOSE_FIRST, to better reflect what it does. * refactor EditLogInputStream a bit Implement Recovery Mode --- Key: HDFS-3004 URL: https://issues.apache.org/jira/browse/HDFS-3004 Project: Hadoop HDFS Issue Type: New Feature Components: tools Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-3004.010.patch, HDFS-3004.011.patch, HDFS-3004.012.patch, HDFS-3004__namenode_recovery_tool.txt When the NameNode metadata is corrupt for some reason, we want to be able to fix it. Obviously, we would prefer never to get in this case. In a perfect world, we never would. However, bad data on disk can happen from time to time, because of hardware errors or misconfigurations. In the past we have had to correct it manually, which is time-consuming and which can result in downtime. Recovery mode is initialized by the system administrator. When the NameNode starts up in Recovery Mode, it will try to load the FSImage file, apply all the edits from the edits log, and then write out a new image. Then it will shut down. Unlike in the normal startup process, the recovery mode startup process will be interactive. When the NameNode finds something that is inconsistent, it will prompt the operator as to what it should do. The operator can also choose to take the first option for all prompts by starting up with the '-f' flag, or typing 'a' at one of the prompts. I have reused as much code as possible from the NameNode in this tool. Hopefully, the effort that was spent developing this will also make the NameNode editLog and image processing even more robust than it already is. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3005) ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..)
[ https://issues.apache.org/jira/browse/HDFS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230413#comment-13230413 ] Hudson commented on HDFS-3005: -- Integrated in Hadoop-Hdfs-trunk-Commit #1953 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1953/]) HDFS-3005. FSVolume.decDfsUsed(..) should be synchronized. (Revision 1301127) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301127 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..) Key: HDFS-3005 URL: https://issues.apache.org/jira/browse/HDFS-3005 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.24.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 0.23.3 Attachments: HDFS-3005.patch, h3005_20120312.patch, h3005_20120314.patch, h3005_20120314b.patch Saw this in [build #1888|https://builds.apache.org/job/PreCommit-HDFS-Build/1888//testReport/org.apache.hadoop.hdfs.server.datanode/TestMulitipleNNDataBlockScanner/testBlockScannerAfterRestart/]. 
{noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.getDfsUsed(FSDataset.java:557) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:809) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.access$1400(FSDataset.java:774) at org.apache.hadoop.hdfs.server.datanode.FSDataset.getDfsUsed(FSDataset.java:1124) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.sendHeartBeat(BPOfferService.java:406) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.offerService(BPOfferService.java:490) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.run(BPOfferService.java:635) at java.lang.Thread.run(Thread.java:662) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230418#comment-13230418 ] Todd Lipcon commented on HDFS-3099: --- Any chance you could add a simple test case in TestSecondaryWebUi? Should be only a few lines. SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3099.patch The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
[ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230419#comment-13230419 ] Aaron T. Myers commented on HDFS-3067: -- I bet the test failure of TestDatanodeBlockScanner is simply HDFS-2881. I've just kicked Jenkins for this patch again to see if we can get a clean run. I agree that the TestHDFSCli failure is unrelated. NPE in DFSInputStream.readBuffer if read is repeated on corrupted block --- Key: HDFS-3067 URL: https://issues.apache.org/jira/browse/HDFS-3067 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0 Reporter: Henry Robinson Assignee: Henry Robinson Attachments: HDFS-3067.1.patch, HDFS-3607.patch With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException. Here's the body of a test that reproduces the problem: {code} final short REPL_FACTOR = 1; final long FILE_LENGTH = 512L; cluster.waitActive(); FileSystem fs = cluster.getFileSystem(); Path path = new Path("/corrupted"); DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L); DFSTestUtil.waitReplication(fs, path, REPL_FACTOR); ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path); int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block); assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted); InetSocketAddress nnAddr = new InetSocketAddress("localhost", cluster.getNameNodePort()); DFSClient client = new DFSClient(nnAddr, conf); DFSInputStream dis = client.open(path.toString()); byte[] arr = new byte[(int)FILE_LENGTH]; boolean sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); } catch (ChecksumException ex) { sawException = true; } assertTrue(sawException); sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); // <-- NPE thrown here } catch (ChecksumException ex) { sawException = true; } {code} The stack: 
{code} java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545) [snip test stack] {code} and the problem is that currentNode is null. It's left at null after the first read, which fails, and then is never refreshed because the condition in read that protects blockSeekTo is only triggered if the current position is outside the block's range. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3091) Failed to add new DataNode in pipeline and will be resulted into write failure.
[ https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230426#comment-13230426 ] Tsz Wo (Nicholas), SZE commented on HDFS-3091: -- Hi Uma, First of all, thanks for testing it. I would say the failures are expected. The feature is to guarantee the number of replicas that the user is asking. However, the cluster is too small that the guarantee is impossible. It makes sense to fail the write requests. Note that policy is a client side configuration. The user could set the policy to NEVER. For the 3-node case, the admin should disable the feature (or set the policy to NEVER in the default conf.) Does it make sense to you? Failed to add new DataNode in pipeline and will be resulted into write failure. --- Key: HDFS-3091 URL: https://issues.apache.org/jira/browse/HDFS-3091 Project: Hadoop HDFS Issue Type: Bug Components: data-node, hdfs client, name-node Affects Versions: 0.23.0, 0.24.0 Reporter: Uma Maheswara Rao G When verifying the HDFS-1606 feature, Observed couple of issues. Presently the ReplaceDatanodeOnFailure policy satisfies even though we dont have enough DN to replcae in cluster and will be resulted into write failure. 
{quote} 12/03/13 14:27:12 WARN hdfs.DFSClient: DataStreamer Exception java.io.IOException: Failed to add a datanode: nodes.length != original.length + 1, nodes=[10.18.52.55:50010], original=[10.18.52.55:50010] at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:778) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:834) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:930) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:741) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:416) {quote} Let's take some cases: 1) Replication factor is 3 and the cluster size is also 3, and unfortunately the pipeline drops to 1. ReplaceDatanodeOnFailure will be satisfied because *existings(1) <= replication/2 (3/2==1)*. But when it tries to find a new node for the replacement, it obviously cannot, and the sanity check will fail. This results in a write failure. 2) Replication factor is 10 (the user accidentally sets the replication factor higher than the cluster size) and the cluster has only 5 datanodes. Here, even if only one node fails, the write will fail for the same reason: the pipeline can be at most 5, and after one datanode is killed, existings will be 4, so *existings(4) <= replication/2 (10/2==5)* is satisfied; obviously it cannot replace the node with a new one, as no extra nodes exist in the cluster. This results in a write failure. 3) sync related operations also fail in these situations (will post the clear scenarios)
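The two failure cases above follow from the default policy's condition combined with the pipeline sanity check. A standalone sketch, assuming the condition is existings <= replication/2 as the comment's arithmetic suggests (hypothetical names, not the actual HDFS source):

```java
// Sketch of the cases described above. shouldReplace mirrors the DEFAULT
// ReplaceDatanodeOnFailure condition as described; replacementPossible mirrors
// the sanity check nodes.length == original.length + 1, which needs a spare
// live node to succeed.
public class ReplaceOnFailureSketch {
    static boolean shouldReplace(int existings, int replication) {
        return existings <= replication / 2;
    }

    static boolean replacementPossible(int liveNodes, int existings) {
        return liveNodes > existings;   // at least one extra node must exist
    }

    // The write fails when replacement is demanded but impossible.
    // Case 1: replication 3, pipeline and live nodes down to 1.
    // Case 2: replication 10, cluster of 5, one node lost -> existings 4.
    static boolean writeFails(int liveNodes, int existings, int replication) {
        return shouldReplace(existings, replication)
            && !replacementPossible(liveNodes, existings);
    }
}
```

This also shows why setting the policy to NEVER (so shouldReplace never fires) avoids the failure on small clusters.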
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230432#comment-13230432 ] Andrew Purtell commented on HDFS-3077: -- From a user perspective. bq. [Todd] I think a quorum commit is vastly superior for HA, especially given we'd like to collocate the log replicas on machines doing other work. When those machines have latency hiccups, or crash, we don't want the active NN to have to wait for long timeout periods before continuing. I think this is a promising direction. See next: bq. [Eli] BK has two of the same main issues that we have depending on an HA filer: (1) many users don't want to admin a separate storage system (even if you embed BK it will be discrete, fail independently etc) Perhaps we can go so far as to suggest the loggers be an additional thread added to the DataNodes. Perhaps some subset of the DN pool is elected for the purpose. (Need we waste a whole disk just for the transaction log? Maybe the log can be shared with DN storage. Or using an SSD device for this purpose seems reasonable, but the average user should not be expected to have nodes with such on hand.) On the one hand, this would increase the internal complexity of the DataNode implementation, even if the functionality can be pretty well partitioned -- separate package, separate thread, etc. On the other hand, there would not be yet another moving part to consider when deploying components around the cluster: ZooKeeper quorum peers, NameNodes, DataNodes, the YARN AM, the YARN NMs, HBase Masters, HBase RegionServers, etc. This idea may go too far, but IMHO embedding BookKeeper goes far enough in the other direction to give me heartburn thinking about HA cluster ops.
Quorum-based protocol for reading and writing edit logs --- Key: HDFS-3077 URL: https://issues.apache.org/jira/browse/HDFS-3077 Project: Hadoop HDFS Issue Type: New Feature Components: ha, name-node Reporter: Todd Lipcon Assignee: Todd Lipcon Currently, one of the weak points of the HA design is that it relies on shared storage such as an NFS filer for the shared edit log. One alternative that has been proposed is to depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated edit log on commodity hardware. This JIRA is to implement another alternative, based on a quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only by HDFS's needs rather than more generic use cases. More details to follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3100) failed to append data using webhdfs
failed to append data using webhdfs --- Key: HDFS-3100 URL: https://issues.apache.org/jira/browse/HDFS-3100 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.23.1 Reporter: Zhanwei.Wang STEP: 1, deploy a single node hdfs 0.23.1 cluster and configure hdfs as follows: A) enable webhdfs B) enable append C) disable permissions 2, start hdfs 3, run the test script as attached RESULT: expected: a file named testFile is created and populated with 32K * 5000 zeros, and HDFS stays healthy. I got: the script cannot finish; the file has been created but is not populated as expected, because the append operation failed. The datanode log shows that the block scanner reported a bad replica and the namenode decided to delete it. Since it is a single node cluster, the append fails. The script fails this way every time, which should not happen. Datanode and Namenode logs are attached.
[jira] [Updated] (HDFS-3100) failed to append data using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhanwei.Wang updated HDFS-3100: --- Attachment: hadoop-wangzw-namenode-ubuntu.log hadoop-wangzw-datanode-ubuntu.log test.sh test script and logs failed to append data using webhdfs --- Key: HDFS-3100 URL: https://issues.apache.org/jira/browse/HDFS-3100 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Attachments: hadoop-wangzw-datanode-ubuntu.log, hadoop-wangzw-namenode-ubuntu.log, test.sh STEP: 1, deploy a single node hdfs 0.23.1 cluster and configure hdfs as: A) enable webhdfs B) enable append C) disable permissions 2, start hdfs 3, run the test script as attached RESULT: expected: a file named testFile should be created and populated with 32K * 5000 zeros, HDFS should be OK. I got: script cannot be finished, file has been created but not be populated as expected, actually append operation failed. Datanode log shows that, blockscaner report a bad replica and nanenode decide to delete it. Since it is a single node cluster, append fail. It makes no sense that the script failed every time. Datanode and Namenode logs are attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3101) cannot read empty file using webhdfs
cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang STEP: 1, create a new EMPTY file 2, read it using webhdfs. RESULT: expected: get an empty file. I got: {"RemoteException":{"exception":"IOException","javaClassName":"java.io.IOException","message":"Offset=0 out of the range [0, 0); OPEN, path=/testFile"}} First of all, [0, 0) is not a valid range, and I think reading an empty file should be OK.
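The "[0, 0)" error above is consistent with a half-open range check that rejects offset == 0 on a zero-length file. A sketch of that check and the obvious fix (hypothetical names, not the actual WebHDFS code):

```java
// Sketch of the range check implied by the error above: with a half-open
// valid range [0, length), an empty file (length 0) rejects even offset 0.
public class OffsetCheckSketch {
    // Behaviour reported in the bug.
    static boolean inRangeBuggy(long offset, long length) {
        return 0 <= offset && offset < length;
    }

    // Allowing offset == length makes a zero-byte read at EOF (and hence
    // reading an empty file) succeed.
    static boolean inRangeFixed(long offset, long length) {
        return 0 <= offset && offset <= length;
    }
}
```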
[jira] [Assigned] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE reassigned HDFS-3101: Assignee: Tsz Wo (Nicholas), SZE cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE STEP: 1, create a new EMPTY file 2, read it using webhdfs. RESULT: expected: get a empty file I got: {RemoteException:{exception:IOException,javaClassName:java.io.IOException,message:Offset=0 out of the range [0, 0); OPEN, path=/testFile}} First of all, [0, 0) is not a valid range, and I think read a empty file should be OK. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3005) ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..)
[ https://issues.apache.org/jira/browse/HDFS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230440#comment-13230440 ] Hudson commented on HDFS-3005: -- Integrated in Hadoop-Mapreduce-0.23-Commit #692 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/692/]) svn merge -c 1301127 from trunk for HDFS-3005. (Revision 1301130) Result = ABORTED szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301130 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..) Key: HDFS-3005 URL: https://issues.apache.org/jira/browse/HDFS-3005 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.24.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 0.23.3 Attachments: HDFS-3005.patch, h3005_20120312.patch, h3005_20120314.patch, h3005_20120314b.patch Saw this in [build #1888|https://builds.apache.org/job/PreCommit-HDFS-Build/1888//testReport/org.apache.hadoop.hdfs.server.datanode/TestMulitipleNNDataBlockScanner/testBlockScannerAfterRestart/]. 
{noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.getDfsUsed(FSDataset.java:557) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:809) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.access$1400(FSDataset.java:774) at org.apache.hadoop.hdfs.server.datanode.FSDataset.getDfsUsed(FSDataset.java:1124) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.sendHeartBeat(BPOfferService.java:406) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.offerService(BPOfferService.java:490) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.run(BPOfferService.java:635) at java.lang.Thread.run(Thread.java:662) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3005) ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..)
[ https://issues.apache.org/jira/browse/HDFS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230439#comment-13230439 ] Hudson commented on HDFS-3005: -- Integrated in Hadoop-Mapreduce-trunk-Commit #1887 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1887/]) HDFS-3005. FSVolume.decDfsUsed(..) should be synchronized. (Revision 1301127) Result = ABORTED szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301127 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..) Key: HDFS-3005 URL: https://issues.apache.org/jira/browse/HDFS-3005 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.24.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 0.23.3 Attachments: HDFS-3005.patch, h3005_20120312.patch, h3005_20120314.patch, h3005_20120314b.patch Saw this in [build #1888|https://builds.apache.org/job/PreCommit-HDFS-Build/1888//testReport/org.apache.hadoop.hdfs.server.datanode/TestMulitipleNNDataBlockScanner/testBlockScannerAfterRestart/]. 
{noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.getDfsUsed(FSDataset.java:557) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:809) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.access$1400(FSDataset.java:774) at org.apache.hadoop.hdfs.server.datanode.FSDataset.getDfsUsed(FSDataset.java:1124) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.sendHeartBeat(BPOfferService.java:406) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.offerService(BPOfferService.java:490) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.run(BPOfferService.java:635) at java.lang.Thread.run(Thread.java:662) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-3100) failed to append data using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE reassigned HDFS-3100: Assignee: Tsz Wo (Nicholas), SZE failed to append data using webhdfs --- Key: HDFS-3100 URL: https://issues.apache.org/jira/browse/HDFS-3100 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Attachments: hadoop-wangzw-datanode-ubuntu.log, hadoop-wangzw-namenode-ubuntu.log, test.sh STEP: 1, deploy a single node hdfs 0.23.1 cluster and configure hdfs as: A) enable webhdfs B) enable append C) disable permissions 2, start hdfs 3, run the test script as attached RESULT: expected: a file named testFile should be created and populated with 32K * 5000 zeros, HDFS should be OK. I got: script cannot be finished, file has been created but not be populated as expected, actually append operation failed. Datanode log shows that, blockscaner report a bad replica and nanenode decide to delete it. Since it is a single node cluster, append fail. It makes no sense that the script failed every time. Datanode and Namenode logs are attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3101: - Attachment: h3101_20120315.patch Hi Zhanwei, Good catch. Thanks a lot for filing this bug. Here is a patch h3101_20120315.patch: allow reading on zero size file. Would you mind also testing the patch? cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Attachments: h3101_20120315.patch STEP: 1, create a new EMPTY file 2, read it using webhdfs. RESULT: expected: get a empty file I got: {RemoteException:{exception:IOException,javaClassName:java.io.IOException,message:Offset=0 out of the range [0, 0); OPEN, path=/testFile}} First of all, [0, 0) is not a valid range, and I think read a empty file should be OK. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3101: - Status: Patch Available (was: Open) cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Attachments: h3101_20120315.patch STEP: 1, create a new EMPTY file 2, read it using webhdfs. RESULT: expected: get a empty file I got: {RemoteException:{exception:IOException,javaClassName:java.io.IOException,message:Offset=0 out of the range [0, 0); OPEN, path=/testFile}} First of all, [0, 0) is not a valid range, and I think read a empty file should be OK. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3098) Update FsShell tests for quoted metachars
[ https://issues.apache.org/jira/browse/HDFS-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-3098: -- Attachment: HDFS-3098.patch Add tests to ensure quoted metachars are taken literally: list directories with *s; create a dir with a * subdir and a regular subdir; delete the * subdir; ensure the other subdir was not caught by a glob and still exists. Update FsShell tests for quoted metachars - Key: HDFS-3098 URL: https://issues.apache.org/jira/browse/HDFS-3098 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.24.0, 0.23.2 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-3098.patch Need to add tests to TestDFSShell for quoted metachars.
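The test plan above turns on quoted metacharacters being treated literally: removing a directory literally named * must not glob-match its sibling. A sketch of the distinction using java.nio's glob matcher (illustrative only; FsShell has its own globbing code):

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// An unquoted "*" glob-matches every sibling name; a quoted "*" should match
// only the entry literally named "*".
public class GlobVsLiteralSketch {
    static boolean globMatches(String pattern, String name) {
        PathMatcher m = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
        return m.matches(Paths.get(name));
    }

    static boolean literalMatches(String pattern, String name) {
        return pattern.equals(name);
    }
}
```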
[jira] [Updated] (HDFS-3098) Update FsShell tests for quoted metachars
[ https://issues.apache.org/jira/browse/HDFS-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-3098: -- Target Version/s: 0.24.0, 0.23.2 (was: 0.23.2, 0.24.0) Status: Patch Available (was: Open) Update FsShell tests for quoted metachars - Key: HDFS-3098 URL: https://issues.apache.org/jira/browse/HDFS-3098 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.24.0, 0.23.2 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-3098.patch Need to add tests to TestDFSShell for quoted metachars. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3099: - Attachment: HDFS-3099.patch Here's another patch which just adds a simple test case. SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3099.patch, HDFS-3099.patch The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230499#comment-13230499 ] Todd Lipcon commented on HDFS-3099: --- You're going to slap me for making you do another rev on this, but: can you change the @Before to a @BeforeClass, so that we only use one minicluster here instead of one per case? SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3099.patch, HDFS-3099.patch The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
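The review comment above is a test-lifecycle point: JUnit runs @Before once per test method but @BeforeClass once per class, so a shared minicluster amortizes its startup cost across all the cases. A plain-Java sketch of the difference, counting stand-in cluster starts rather than depending on the test framework:

```java
// Counts how many times the expensive fixture (a stand-in for MiniDFSCluster)
// is started under each setup style.
public class SetupLifecycleSketch {
    // @Before style: a cluster is started for every test method.
    static int perTestStarts(int testMethods) {
        int starts = 0;
        for (int i = 0; i < testMethods; i++) {
            starts++;
        }
        return starts;
    }

    // @BeforeClass style: one cluster is started and shared by all methods.
    static int perClassStarts(int testMethods) {
        return testMethods > 0 ? 1 : 0;
    }
}
```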
[jira] [Updated] (HDFS-3094) add -nonInteractive and -force option to namenode -format command
[ https://issues.apache.org/jira/browse/HDFS-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-3094: -- Attachment: HDFS-3094.docs.patch HDFS-3094.branch-1.0.patch updated documentation for branch 1.0, attached a patch for trunk doc as its in common. add -nonInteractive and -force option to namenode -format command - Key: HDFS-3094 URL: https://issues.apache.org/jira/browse/HDFS-3094 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.24.0, 1.0.2 Reporter: Arpit Gupta Assignee: Arpit Gupta Attachments: HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.docs.patch Currently the bin/hadoop namenode -format prompts the user for a Y/N to setup the directories in the local file system. -force : namenode formats the directories without prompting -nonInterActive : namenode format will return with an exit code of 1 if the dir exists. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3094) add -nonInteractive and -force option to namenode -format command
[ https://issues.apache.org/jira/browse/HDFS-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-3094: -- Status: Patch Available (was: Open) add -nonInteractive and -force option to namenode -format command - Key: HDFS-3094 URL: https://issues.apache.org/jira/browse/HDFS-3094 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.24.0, 1.0.2 Reporter: Arpit Gupta Assignee: Arpit Gupta Attachments: HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.docs.patch, HDFS-3094.patch Currently the bin/hadoop namenode -format prompts the user for a Y/N to setup the directories in the local file system. -force : namenode formats the directories without prompting -nonInterActive : namenode format will return with an exit code of 1 if the dir exists. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3094) add -nonInteractive and -force option to namenode -format command
[ https://issues.apache.org/jira/browse/HDFS-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-3094: -- Attachment: HDFS-3094.patch patch for trunk, the docs patch is in a separate file HDFS-3094.docs.patch as that is in common. add -nonInteractive and -force option to namenode -format command - Key: HDFS-3094 URL: https://issues.apache.org/jira/browse/HDFS-3094 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.24.0, 1.0.2 Reporter: Arpit Gupta Assignee: Arpit Gupta Attachments: HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.docs.patch, HDFS-3094.patch Currently the bin/hadoop namenode -format prompts the user for a Y/N to setup the directories in the local file system. -force : namenode formats the directories without prompting -nonInterActive : namenode format will return with an exit code of 1 if the dir exists. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
[ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230509#comment-13230509 ] Hadoop QA commented on HDFS-3067: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517987/HDFS-3067.1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.cli.TestHDFSCLI +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2010//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2010//console This message is automatically generated. NPE in DFSInputStream.readBuffer if read is repeated on corrupted block --- Key: HDFS-3067 URL: https://issues.apache.org/jira/browse/HDFS-3067 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0 Reporter: Henry Robinson Assignee: Henry Robinson Attachments: HDFS-3067.1.patch, HDFS-3607.patch With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException. 
Here's the body of a test that reproduces the problem: {code} final short REPL_FACTOR = 1; final long FILE_LENGTH = 512L; cluster.waitActive(); FileSystem fs = cluster.getFileSystem(); Path path = new Path("/corrupted"); DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L); DFSTestUtil.waitReplication(fs, path, REPL_FACTOR); ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path); int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block); assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted); InetSocketAddress nnAddr = new InetSocketAddress("localhost", cluster.getNameNodePort()); DFSClient client = new DFSClient(nnAddr, conf); DFSInputStream dis = client.open(path.toString()); byte[] arr = new byte[(int)FILE_LENGTH]; boolean sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); } catch (ChecksumException ex) { sawException = true; } assertTrue(sawException); sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); // -- NPE thrown here } catch (ChecksumException ex) { sawException = true; } {code} The stack: {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545) [snip test stack] {code} and the problem is that currentNode is null. It's left at null after the first read, which fails, and then is never refreshed because the condition in read that protects blockSeekTo is only triggered if the current position is outside the block's range.
[jira] [Updated] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3099: - Attachment: HDFS-3099.patch Switch to using before/after class. SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3099.patch, HDFS-3099.patch, HDFS-3099.patch The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3100) failed to append data using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3100: - Attachment: testAppend.patch Unfortunately, this is not specific to WebHDFS. HDFS also fails with the test. testAppend.patch: unit tests similar to Zhanwei's script. failed to append data using webhdfs --- Key: HDFS-3100 URL: https://issues.apache.org/jira/browse/HDFS-3100 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Attachments: hadoop-wangzw-datanode-ubuntu.log, hadoop-wangzw-namenode-ubuntu.log, test.sh, testAppend.patch STEP: 1, deploy a single-node hdfs 0.23.1 cluster and configure hdfs as: A) enable webhdfs B) enable append C) disable permissions 2, start hdfs 3, run the test script as attached RESULT: expected: a file named testFile should be created and populated with 32K * 5000 zeros, and HDFS should be OK. I got: the script cannot finish; the file has been created but is not populated as expected; the append operation failed. The datanode log shows that the block scanner reported a bad replica and the namenode decided to delete it. Since it is a single-node cluster, the append then fails. The script failed every time, which makes no sense. Datanode and Namenode logs are attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230513#comment-13230513 ] Todd Lipcon commented on HDFS-3077: --- Hey Andrew, thanks for the ops perspective. The idea of embedding these logger daemons inside others is definitely something I'm considering. Embedding in DNs is one idea -- the other direction is to actually have a quorum of NNs, so that when an edit is logged, it is also applied to the SBN's namespace. But for simplicity on a first cut, I think the plan is to go with external processes and then figure out where best to embed them. Quorum-based protocol for reading and writing edit logs --- Key: HDFS-3077 URL: https://issues.apache.org/jira/browse/HDFS-3077 Project: Hadoop HDFS Issue Type: New Feature Components: ha, name-node Reporter: Todd Lipcon Assignee: Todd Lipcon Currently, one of the weak points of the HA design is that it relies on shared storage such as an NFS filer for the shared edit log. One alternative that has been proposed is to depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated edit log on commodity hardware. This JIRA is to implement another alternative, based on a quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only by HDFS's needs rather than more generic use cases. More details to follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
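For readers unfamiliar with quorum commit, the core rule behind the proposal above is just majority acknowledgement: an edit is considered durable once a strict majority of the logger daemons have acked it. A plain-Java sketch of that rule (illustrative only, not the protocol this JIRA will specify):

```java
// Majority-ack rule behind a quorum commit (illustrative sketch).
public class QuorumCommit {
    // An edit is committed once a strict majority of loggers have acked it.
    static boolean committed(int acks, int loggers) {
        return acks >= loggers / 2 + 1;
    }

    public static void main(String[] args) {
        System.out.println(committed(2, 3)); // true: 2 of 3 loggers acked
        System.out.println(committed(1, 3)); // false: no majority yet
        System.out.println(committed(2, 4)); // false: a tie is not a majority
    }
}
```

With 2f+1 loggers this tolerates f failures while still making progress, which is why three external logger processes is the natural first deployment.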
[jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
[ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230514#comment-13230514 ] Aaron T. Myers commented on HDFS-3067: -- Looks to me like the TestDatanodeBlockScanner failure was indeed unrelated. +1, the latest patch looks good to me. I'm going to commit this momentarily. NPE in DFSInputStream.readBuffer if read is repeated on corrupted block --- Key: HDFS-3067 URL: https://issues.apache.org/jira/browse/HDFS-3067 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0 Reporter: Henry Robinson Assignee: Henry Robinson Attachments: HDFS-3067.1.patch, HDFS-3607.patch With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException. Here's the body of a test that reproduces the problem: {code} final short REPL_FACTOR = 1; final long FILE_LENGTH = 512L; cluster.waitActive(); FileSystem fs = cluster.getFileSystem(); Path path = new Path("/corrupted"); DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L); DFSTestUtil.waitReplication(fs, path, REPL_FACTOR); ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path); int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block); assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted); InetSocketAddress nnAddr = new InetSocketAddress("localhost", cluster.getNameNodePort()); DFSClient client = new DFSClient(nnAddr, conf); DFSInputStream dis = client.open(path.toString()); byte[] arr = new byte[(int)FILE_LENGTH]; boolean sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); } catch (ChecksumException ex) { sawException = true; } assertTrue(sawException); sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); // <-- NPE thrown here } catch (ChecksumException ex) { sawException = true; } {code} The stack: {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545) [snip test stack] {code} and the problem is that currentNode is null. It's left at null after the first read, which fails, and then is never refreshed because the condition in read that protects blockSeekTo is only triggered if the current position is outside the block's range. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
[ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3067: - Resolution: Fixed Fix Version/s: 0.24.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've just committed this to trunk. Thanks a lot for the contribution, Hank. NPE in DFSInputStream.readBuffer if read is repeated on corrupted block --- Key: HDFS-3067 URL: https://issues.apache.org/jira/browse/HDFS-3067 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0 Reporter: Henry Robinson Assignee: Henry Robinson Fix For: 0.24.0 Attachments: HDFS-3067.1.patch, HDFS-3607.patch With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException. Here's the body of a test that reproduces the problem: {code} final short REPL_FACTOR = 1; final long FILE_LENGTH = 512L; cluster.waitActive(); FileSystem fs = cluster.getFileSystem(); Path path = new Path("/corrupted"); DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L); DFSTestUtil.waitReplication(fs, path, REPL_FACTOR); ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path); int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block); assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted); InetSocketAddress nnAddr = new InetSocketAddress("localhost", cluster.getNameNodePort()); DFSClient client = new DFSClient(nnAddr, conf); DFSInputStream dis = client.open(path.toString()); byte[] arr = new byte[(int)FILE_LENGTH]; boolean sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); } catch (ChecksumException ex) { sawException = true; } assertTrue(sawException); sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); // <-- NPE thrown here } catch (ChecksumException ex) { sawException = true; } {code} The stack: {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545) [snip test stack] {code} and the problem is that currentNode is null. It's left at null after the first read, which fails, and then is never refreshed because the condition in read that protects blockSeekTo is only triggered if the current position is outside the block's range. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230518#comment-13230518 ] Todd Lipcon commented on HDFS-3099: --- Excellent. +1 pending hudson SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3099.patch, HDFS-3099.patch, HDFS-3099.patch The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3100) failed to append data using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230520#comment-13230520 ] Tsz Wo (Nicholas), SZE commented on HDFS-3100: -- It looks like the BlockPoolSliceScanner incorrectly marks the replica as corrupted. {noformat} 2012-03-15 13:26:27,317 WARN datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(419)) - First Verification failed for BP-426067686-10.10.10.105-1331843180223:blk_-951537730291424878_1083 java.io.IOException: Stream closed at java.io.BufferedInputStream.getInIfOpen(BufferedInputStream.java:134) at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) at java.io.BufferedInputStream.read(BufferedInputStream.java:237) at java.io.DataInputStream.readShort(DataInputStream.java:295) at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:78) at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:228) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.verifyBlock(BlockPoolSliceScanner.java:378) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.verifyFirstBlock(BlockPoolSliceScanner.java:463) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.scan(BlockPoolSliceScanner.java:594) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.scanBlockPoolSlice(BlockPoolSliceScanner.java:570) at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.run(DataBlockScanner.java:95) at java.lang.Thread.run(Thread.java:680) 2012-03-15 13:26:27,320 WARN datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(419)) - Second Verification failed for BP-426067686-10.10.10.105-1331843180223:blk_-951537730291424878_1083 java.io.IOException: Stream closed at java.io.BufferedInputStream.getInIfOpen(BufferedInputStream.java:134) at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) at java.io.BufferedInputStream.read(BufferedInputStream.java:237) at java.io.DataInputStream.readShort(DataInputStream.java:295) at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:78) at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:228) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.verifyBlock(BlockPoolSliceScanner.java:378) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.verifyFirstBlock(BlockPoolSliceScanner.java:463) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.scan(BlockPoolSliceScanner.java:594) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.scanBlockPoolSlice(BlockPoolSliceScanner.java:570) at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.run(DataBlockScanner.java:95) at java.lang.Thread.run(Thread.java:680) 2012-03-15 13:26:27,320 WARN datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:addBlock(234)) - Adding an already existing block BP-426067686-10.10.10.105-1331843180223:blk_-951537730291424878_1084 2012-03-15 13:26:27,320 INFO datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:handleScanFailure(301)) - Reporting bad block BP-426067686-10.10.10.105-1331843180223:blk_-951537730291424878_1083 2012-03-15 13:26:27,321 INFO DataNode.clienttrace (BlockReceiver.java:run(1062)) - src: /127.0.0.1:54170, dest: /127.0.0.1:54083, bytes: 84992, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_2012624116_1, offset: 0, srvID: DS-600201831-10.10.10.105-54083-1331843 {noformat} failed to append data using webhdfs --- Key: HDFS-3100 URL: https://issues.apache.org/jira/browse/HDFS-3100 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Attachments: hadoop-wangzw-datanode-ubuntu.log, hadoop-wangzw-namenode-ubuntu.log, test.sh, testAppend.patch STEP: 1, deploy a single-node hdfs 0.23.1 cluster and configure hdfs as: A) enable webhdfs B) enable append C) disable permissions 2, start hdfs 3, run the test script as attached RESULT: expected: a file named testFile should be created and populated with 32K * 5000 zeros, and HDFS should be OK. I got: the script cannot finish; the file has been created but is not populated as expected; the append operation failed. The datanode log shows that the block scanner reported a bad replica and the namenode decided to delete it. Since it is a single-node cluster, the append then fails. The script failed every time, which makes no sense. Datanode and Namenode logs are attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
[ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230525#comment-13230525 ] Hudson commented on HDFS-3067: -- Integrated in Hadoop-Hdfs-trunk-Commit #1954 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1954/]) HDFS-3067. NPE in DFSInputStream.readBuffer if read is repeated on corrupted block. Contributed by Henry Robinson. (Revision 1301182) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301182 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java NPE in DFSInputStream.readBuffer if read is repeated on corrupted block --- Key: HDFS-3067 URL: https://issues.apache.org/jira/browse/HDFS-3067 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0 Reporter: Henry Robinson Assignee: Henry Robinson Fix For: 0.24.0 Attachments: HDFS-3067.1.patch, HDFS-3607.patch With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException. 
Here's the body of a test that reproduces the problem: {code} final short REPL_FACTOR = 1; final long FILE_LENGTH = 512L; cluster.waitActive(); FileSystem fs = cluster.getFileSystem(); Path path = new Path("/corrupted"); DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L); DFSTestUtil.waitReplication(fs, path, REPL_FACTOR); ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path); int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block); assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted); InetSocketAddress nnAddr = new InetSocketAddress("localhost", cluster.getNameNodePort()); DFSClient client = new DFSClient(nnAddr, conf); DFSInputStream dis = client.open(path.toString()); byte[] arr = new byte[(int)FILE_LENGTH]; boolean sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); } catch (ChecksumException ex) { sawException = true; } assertTrue(sawException); sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); // <-- NPE thrown here } catch (ChecksumException ex) { sawException = true; } {code} The stack: {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545) [snip test stack] {code} and the problem is that currentNode is null. It's left at null after the first read, which fails, and then is never refreshed because the condition in read that protects blockSeekTo is only triggered if the current position is outside the block's range. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
[ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230526#comment-13230526 ] Hudson commented on HDFS-3067: -- Integrated in Hadoop-Common-trunk-Commit #1879 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1879/]) HDFS-3067. NPE in DFSInputStream.readBuffer if read is repeated on corrupted block. Contributed by Henry Robinson. (Revision 1301182) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301182 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java NPE in DFSInputStream.readBuffer if read is repeated on corrupted block --- Key: HDFS-3067 URL: https://issues.apache.org/jira/browse/HDFS-3067 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0 Reporter: Henry Robinson Assignee: Henry Robinson Fix For: 0.24.0 Attachments: HDFS-3067.1.patch, HDFS-3607.patch With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException. 
Here's the body of a test that reproduces the problem: {code} final short REPL_FACTOR = 1; final long FILE_LENGTH = 512L; cluster.waitActive(); FileSystem fs = cluster.getFileSystem(); Path path = new Path("/corrupted"); DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L); DFSTestUtil.waitReplication(fs, path, REPL_FACTOR); ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path); int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block); assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted); InetSocketAddress nnAddr = new InetSocketAddress("localhost", cluster.getNameNodePort()); DFSClient client = new DFSClient(nnAddr, conf); DFSInputStream dis = client.open(path.toString()); byte[] arr = new byte[(int)FILE_LENGTH]; boolean sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); } catch (ChecksumException ex) { sawException = true; } assertTrue(sawException); sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); // <-- NPE thrown here } catch (ChecksumException ex) { sawException = true; } {code} The stack: {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545) [snip test stack] {code} and the problem is that currentNode is null. It's left at null after the first read, which fails, and then is never refreshed because the condition in read that protects blockSeekTo is only triggered if the current position is outside the block's range. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2949) HA: Add check to active state transition to prevent operator-induced split brain
[ https://issues.apache.org/jira/browse/HDFS-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230550#comment-13230550 ] Todd Lipcon commented on HDFS-2949: --- Another safety check here is to make sure that the transaction IDs match between the nodes before going active. HA: Add check to active state transition to prevent operator-induced split brain Key: HDFS-2949 URL: https://issues.apache.org/jira/browse/HDFS-2949 Project: Hadoop HDFS Issue Type: Improvement Components: ha, name-node Affects Versions: 0.24.0 Reporter: Todd Lipcon Currently, if the administrator mistakenly calls -transitionToActive on one NN while the other one is still active, all hell will break loose. We can add a simple check by having the NN make a getServiceState() RPC to its peer with a short (~1 second?) timeout. If the RPC succeeds and indicates the other node is active, it should refuse to enter active mode. If the RPC fails or indicates standby, it can proceed. This is just meant as a preventative safety check - we still expect users to use the -failover command which has other checks plus fencing built in. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
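The check described in this issue reduces to a small decision rule: refuse the transition only when the peer definitively reports it is active; a failed or timed-out getServiceState() RPC lets the transition proceed. A plain-Java sketch under those assumptions (PeerState and mayGoActive are hypothetical stand-ins, not Hadoop APIs):

```java
// Sketch of the operator-safety check: before entering active state,
// probe the peer NN's state with a short timeout (modeled here as an enum).
public class TransitionCheck {
    enum PeerState { ACTIVE, STANDBY, UNREACHABLE }

    // Refuse only when the peer definitively reports it is active;
    // an RPC failure (UNREACHABLE) does not block the transition,
    // since the -failover command with fencing covers that case.
    static boolean mayGoActive(PeerState peer) {
        return peer != PeerState.ACTIVE;
    }

    public static void main(String[] args) {
        System.out.println(mayGoActive(PeerState.ACTIVE));      // false: would split-brain
        System.out.println(mayGoActive(PeerState.STANDBY));     // true
        System.out.println(mayGoActive(PeerState.UNREACHABLE)); // true
    }
}
```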
[jira] [Commented] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230559#comment-13230559 ] Daryn Sharp commented on HDFS-3101: --- +1 Cute edge case. Looks straightforward. cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Attachments: h3101_20120315.patch STEP: 1, create a new EMPTY file 2, read it using webhdfs. RESULT: expected: get an empty file I got: {"RemoteException":{"exception":"IOException","javaClassName":"java.io.IOException","message":"Offset=0 out of the range [0, 0); OPEN, path=/testFile"}} First of all, [0, 0) is not a valid range, and I think reading an empty file should be OK. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
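The error message above suggests the offset check treats the valid range as [0, len), which is empty when the file length is zero. A plain-Java sketch of that check and the obvious relaxation (hypothetical helper names, not the actual WebHDFS code):

```java
// Sketch of the range check implied by the "Offset=0 out of the range [0, 0)"
// error: a half-open [0, len) check rejects offset 0 on an empty file,
// while allowing offset == len makes a zero-byte read succeed.
public class OffsetCheck {
    static boolean inRangeBuggy(long offset, long len) {
        return offset >= 0 && offset < len;   // [0, 0) is empty, so offset 0 is rejected
    }

    static boolean inRangeFixed(long offset, long len) {
        return offset >= 0 && offset <= len;  // offset 0 on an empty file is accepted
    }

    public static void main(String[] args) {
        System.out.println(inRangeBuggy(0, 0)); // false: the IOException seen above
        System.out.println(inRangeFixed(0, 0)); // true: empty read returns no bytes
    }
}
```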
[jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
[ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230561#comment-13230561 ] Hudson commented on HDFS-3067: -- Integrated in Hadoop-Mapreduce-trunk-Commit #1888 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1888/]) HDFS-3067. NPE in DFSInputStream.readBuffer if read is repeated on corrupted block. Contributed by Henry Robinson. (Revision 1301182) Result = ABORTED atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301182 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java NPE in DFSInputStream.readBuffer if read is repeated on corrupted block --- Key: HDFS-3067 URL: https://issues.apache.org/jira/browse/HDFS-3067 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0 Reporter: Henry Robinson Assignee: Henry Robinson Fix For: 0.24.0 Attachments: HDFS-3067.1.patch, HDFS-3607.patch With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException. 
Here's the body of a test that reproduces the problem: {code} final short REPL_FACTOR = 1; final long FILE_LENGTH = 512L; cluster.waitActive(); FileSystem fs = cluster.getFileSystem(); Path path = new Path("/corrupted"); DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L); DFSTestUtil.waitReplication(fs, path, REPL_FACTOR); ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path); int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block); assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted); InetSocketAddress nnAddr = new InetSocketAddress("localhost", cluster.getNameNodePort()); DFSClient client = new DFSClient(nnAddr, conf); DFSInputStream dis = client.open(path.toString()); byte[] arr = new byte[(int)FILE_LENGTH]; boolean sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); } catch (ChecksumException ex) { sawException = true; } assertTrue(sawException); sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); // <-- NPE thrown here } catch (ChecksumException ex) { sawException = true; } {code} The stack: {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545) [snip test stack] {code} and the problem is that currentNode is null. It's left at null after the first read, which fails, and then is never refreshed because the condition in read that protects blockSeekTo is only triggered if the current position is outside the block's range. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3102) Add CLI tool to initialize the shared-edits dir
Add CLI tool to initialize the shared-edits dir --- Key: HDFS-3102 URL: https://issues.apache.org/jira/browse/HDFS-3102 Project: Hadoop HDFS Issue Type: Improvement Components: ha, name-node Affects Versions: 0.24.0, 0.23.3 Reporter: Todd Lipcon Currently in order to make a non-HA NN HA, you need to initialize the shared edits dir. This can be done manually by cp'ing directories around. It would be preferable to add a namenode -initializeSharedEdits command to achieve this same effect. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3103) HA NN should be able to handle some cases of storage dir recovery on start
HA NN should be able to handle some cases of storage dir recovery on start -- Key: HDFS-3103 URL: https://issues.apache.org/jira/browse/HDFS-3103 Project: Hadoop HDFS Issue Type: Improvement Components: ha, name-node Affects Versions: 0.24.0, 0.23.3 Reporter: Todd Lipcon As a shortcut in developing HA, we elected not to deal with the case of storage directory recovery while HA is enabled. But there are many cases where we can and should handle it. For example, if the user configures two local dirs and one shared dir, but one of the local dirs is empty at startup, we should be able to re-format the empty dir from the other local dir.
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230581#comment-13230581 ] Hadoop QA commented on HDFS-3099: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12518530/HDFS-3099.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.server.namenode.TestValidateConfigurationSettings +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2012//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2012//console This message is automatically generated. SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3099.patch, HDFS-3099.patch, HDFS-3099.patch The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output. -- This message is automatically generated by JIRA. 
[jira] [Created] (HDFS-3104) Add tests for mkdir -p
Add tests for mkdir -p -- Key: HDFS-3104 URL: https://issues.apache.org/jira/browse/HDFS-3104 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.24.0, 0.23.2 Reporter: Daryn Sharp Assignee: Daryn Sharp Add tests for HADOOP-8175.
[jira] [Updated] (HDFS-3104) Add tests for mkdir -p
[ https://issues.apache.org/jira/browse/HDFS-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-3104: -- Target Version/s: 0.24.0, 0.23.2 (was: 0.23.2, 0.24.0) Status: Patch Available (was: Open)
[jira] [Updated] (HDFS-3104) Add tests for mkdir -p
[ https://issues.apache.org/jira/browse/HDFS-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-3104: -- Attachment: HDFS-3104.patch
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230592#comment-13230592 ] Aaron T. Myers commented on HDFS-3099: -- The test failures are unrelated to this patch. TestHDFSCLI is currently failing on trunk, and the TestValidateConfigurationSettings failure seems spurious; it passed just fine on my box. I'm going to commit this momentarily.
[jira] [Commented] (HDFS-3104) Add tests for mkdir -p
[ https://issues.apache.org/jira/browse/HDFS-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230606#comment-13230606 ] Robert Joseph Evans commented on HDFS-3104: --- I reviewed the tests here and the corresponding source code change in HADOOP-8175. They both look good to me. +1 (non-binding).
[jira] [Updated] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3099: - Resolution: Fixed Fix Version/s: 0.23.3 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've just committed this to trunk and branch-0.23.
[jira] [Commented] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230614#comment-13230614 ] Hadoop QA commented on HDFS-3101: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12518520/h3101_20120315.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.cli.TestHDFSCLI +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2019//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2019//console This message is automatically generated. cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Attachments: h3101_20120315.patch STEPS: 1. create a new EMPTY file 2. read it using webhdfs. RESULT: expected: get an empty file; got: {RemoteException:{exception:IOException,javaClassName:java.io.IOException,message:Offset=0 out of the range [0, 0); OPEN, path=/testFile}} First of all, [0, 0) is not a valid range, and I think reading an empty file should be OK.
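A hedged sketch of the boundary condition at issue: for a file of length len, valid read offsets are [0, len), except that offset 0 into an empty file should be accepted and simply return no bytes. This is a standalone illustration, not the actual webhdfs code.

```java
import java.io.IOException;

// Sketch of an offset check that admits empty-file reads (assumption:
// simplified standalone code, not the actual webhdfs implementation).
class OffsetCheckSketch {
    static void checkOffset(long offset, long len) throws IOException {
        // offset 0 on an empty file is a legal zero-byte read
        if (len == 0 && offset == 0) return;
        if (offset < 0 || offset >= len) {
            throw new IOException("Offset=" + offset
                + " out of the range [0, " + len + ")");
        }
    }
}
```

With this special case, checkOffset(0, 0) succeeds while checkOffset(5, 5) still fails, matching the behavior the reporter expects.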
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230617#comment-13230617 ] Hadoop QA commented on HDFS-3099: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12518530/HDFS-3099.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hdfs.TestDFSShell org.apache.hadoop.hdfs.TestDFSClientRetries org.apache.hadoop.cli.TestHDFSCLI +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2018//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2018//console This message is automatically generated.
[jira] [Updated] (HDFS-3004) Implement Recovery Mode
[ https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3004: --- Attachment: HDFS-3004.013.patch * remove SkippableEditLogException, as it turned out not to be necessary * test skipping in EditLogInputStream Implement Recovery Mode --- Key: HDFS-3004 URL: https://issues.apache.org/jira/browse/HDFS-3004 Project: Hadoop HDFS Issue Type: New Feature Components: tools Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-3004.010.patch, HDFS-3004.011.patch, HDFS-3004.012.patch, HDFS-3004.013.patch, HDFS-3004__namenode_recovery_tool.txt When the NameNode metadata is corrupt for some reason, we want to be able to fix it. Obviously, we would prefer never to get in this case. In a perfect world, we never would. However, bad data on disk can happen from time to time, because of hardware errors or misconfigurations. In the past we have had to correct it manually, which is time-consuming and which can result in downtime. Recovery mode is initialized by the system administrator. When the NameNode starts up in Recovery Mode, it will try to load the FSImage file, apply all the edits from the edits log, and then write out a new image. Then it will shut down. Unlike in the normal startup process, the recovery mode startup process will be interactive. When the NameNode finds something that is inconsistent, it will prompt the operator as to what it should do. The operator can also choose to take the first option for all prompts by starting up with the '-f' flag, or typing 'a' at one of the prompts. I have reused as much code as possible from the NameNode in this tool. Hopefully, the effort that was spent developing this will also make the NameNode editLog and image processing even more robust than it already is. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230630#comment-13230630 ] Hudson commented on HDFS-3099: -- Integrated in Hadoop-Common-trunk-Commit #1880 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1880/]) HDFS-3099. SecondaryNameNode does not properly initialize metrics system. Contributed by Aaron T. Myers. (Revision 1301222) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1301222 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSecondaryWebUi.java
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230629#comment-13230629 ] Suresh Srinivas commented on HDFS-3077: --- bq. but like Einstein said, no simpler! It's all relative :-) BTW, it would be good to write up the design for this; that avoids lengthy comments and keeps the summary of what is proposed in one place instead of scattered across multiple comments. bq. This is mostly great – so long as you have an external fencing strategy which prevents the old active from attempting to continue to write after the new active is trying to read. External fencing is not needed, given that active daemons have the ability to fence. bq. it gets the loggers to promise not to accept edits from the old active The daemons can stop accepting writes when they realize that the active lock is no longer held by the writer. Clearly an advantage of an active daemon compared to using passive storage. bq. But, we still have one more problem: given some txid N, we might have multiple actives that have tried to write the same transaction ID. Example scenario: The case of writes making it through only some daemons can also be solved: the write that has made it through W daemons wins. The others are marked not in sync and need to sync up. Explanation to follow. The solution we are building is specific to namenode editlogs. There is only one active writer (as Ivan brought up earlier). Here is the outline I am thinking of. Let's start with steady state with K of N journal daemons. When a journal daemon fails, we roll the edits. When a journal daemon joins, we roll the edits. The new journal daemon can start syncing the other finalized edits, while keeping track of edits in progress. We also keep track of the list of active daemons in ZooKeeper. Rolling gives a logical point for a newly joined daemon to sync up (sort of like a generation stamp).
During failover, the new active gets, from the actively written journals, the point up to which it has to sync, and rolls the edits at that point. Rolling also gives you a way to discard, during failover, extra journal records that did not make it to W daemons. When there are overlapping records, say e1-105 and e100-200, you read 100-105 from the second editlog and discard them from the first. Again, there are scenarios that are missing here. I plan to post more details in a design doc. Quorum-based protocol for reading and writing edit logs --- Key: HDFS-3077 URL: https://issues.apache.org/jira/browse/HDFS-3077 Project: Hadoop HDFS Issue Type: New Feature Components: ha, name-node Reporter: Todd Lipcon Assignee: Todd Lipcon Currently, one of the weak points of the HA design is that it relies on shared storage such as an NFS filer for the shared edit log. One alternative that has been proposed is to depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated edit log on commodity hardware. This JIRA is to implement another alternative, based on a quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only by HDFS's needs rather than more generic use cases. More details to follow.
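The W-of-N rule discussed above can be stated in a few lines. This is a hedged toy, not the eventual implementation: a batch of edits counts as committed once a write quorum of the journal daemons has acknowledged it.

```java
import java.util.List;

// Toy quorum-commit check (assumption: simplified standalone sketch,
// not HDFS code). With N daemons, the write quorum is W = floor(N/2) + 1.
class QuorumSketch {
    static boolean isCommitted(List<Boolean> acks) {
        long acked = acks.stream().filter(a -> a).count();
        return acked >= acks.size() / 2 + 1; // acked >= W
    }
}
```

With N = 3 this gives W = 2: a write acknowledged by two daemons wins, while one acknowledged by a single daemon is the "not in sync" case that must be discarded or synced up on failover.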
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230634#comment-13230634 ] Todd Lipcon commented on HDFS-3077: --- bq. The daemons can stop accepting writes when it realizes that active lock is no longer held by the writer. Clearly an advantage of an active daemon compared to using passive storage. Relying on ZK here is insufficient - the actual protocol itself needs fencing to guarantee that a quorum of loggers have seen the lost lock before the new writer starts writing. I agree with your later comments that rolling the edits is a helpful construct here, but you need to also make sure there's consensus on the active writer when beginning a new log segment. I'm about halfway done with a prototype implementation of this; I should have something to show by the middle of next week. At that point I'll also post a more thorough explanation of the design.
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230641#comment-13230641 ] Hudson commented on HDFS-3099: -- Integrated in Hadoop-Common-0.23-Commit #685 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/685/]) HDFS-3099. SecondaryNameNode does not properly initialize metrics system. Contributed by Aaron T. Myers. (Revision 1301230) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1301230 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSecondaryWebUi.java
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230646#comment-13230646 ] Hudson commented on HDFS-3099: -- Integrated in Hadoop-Hdfs-0.23-Commit #676 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/676/]) HDFS-3099. SecondaryNameNode does not properly initialize metrics system. Contributed by Aaron T. Myers. (Revision 1301230) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1301230 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSecondaryWebUi.java
[jira] [Created] (HDFS-3105) Add DatanodeStorage information to block recovery
Add DatanodeStorage information to block recovery - Key: HDFS-3105 URL: https://issues.apache.org/jira/browse/HDFS-3105 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230666#comment-13230666 ] Hudson commented on HDFS-3099: -- Integrated in Hadoop-Mapreduce-trunk-Commit #1889 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1889/]) HDFS-3099. SecondaryNameNode does not properly initialize metrics system. Contributed by Aaron T. Myers. (Revision 1301222) Result = ABORTED atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1301222 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSecondaryWebUi.java
[jira] [Updated] (HDFS-3105) Add DatanodeStorage information to block recovery
[ https://issues.apache.org/jira/browse/HDFS-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3105: - Component/s: hdfs client data-node Description: When recovering a block, the namenode and client do not have the datanode storage information of the block, so the namenode cannot add the block to the corresponding datanode storage block list.
[jira] [Commented] (HDFS-3062) Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission.
[ https://issues.apache.org/jira/browse/HDFS-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230672#comment-13230672 ] Hadoop QA commented on HDFS-3062: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517984/HDFS-3062-trunk-2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.cli.TestHDFSCLI +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2020//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2020//console This message is automatically generated. Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission. Key: HDFS-3062 URL: https://issues.apache.org/jira/browse/HDFS-3062 Project: Hadoop HDFS Issue Type: Bug Components: ha, security Affects Versions: 0.24.0 Reporter: Mingjie Lai Assignee: Mingjie Lai Priority: Critical Attachments: HDFS-3062-trunk-2.patch, HDFS-3062-trunk.patch When testing the combination of NN HA + security + yarn, I found that the mapred job submission cannot pick up the logic URI of a nameservice. 
I have logic URI configured in core-site.xml {code} property namefs.defaultFS/name valuehdfs://ns1/value /property {code} HDFS client can work with the HA deployment/configs: {code} [root@nn1 hadoop]# hdfs dfs -ls / Found 6 items drwxr-xr-x - hbase hadoop 0 2012-03-07 20:42 /hbase drwxrwxrwx - yarn hadoop 0 2012-03-07 20:42 /logs drwxr-xr-x - mapred hadoop 0 2012-03-07 20:42 /mapred drwxr-xr-x - mapred hadoop 0 2012-03-07 20:42 /mr-history drwxrwxrwt - hdfs hadoop 0 2012-03-07 21:57 /tmp drwxr-xr-x - hdfs hadoop 0 2012-03-07 20:42 /user {code} but cannot submit a mapred job with security turned on {code} [root@nn1 hadoop]# /usr/lib/hadoop/bin/yarn --config ./conf jar share/hadoop/mapreduce/hadoop-mapreduce-examples-0.24.0-SNAPSHOT.jar randomwriter out Running 0 maps. Job started: Wed Mar 07 23:28:23 UTC 2012 java.lang.IllegalArgumentException: java.net.UnknownHostException: ns1 at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:431) at org.apache.hadoop.security.SecurityUtil.buildDTServiceName(SecurityUtil.java:312) at org.apache.hadoop.fs.FileSystem.getCanonicalServiceName(FileSystem.java:217) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:119) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:97) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80) at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:137) at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:411) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:326) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1221) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) {code}0.24 -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
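The UnknownHostException above comes from the client treating the logical nameservice name ("ns1") as a DNS hostname while building the delegation-token service name. As a hedged sketch (hypothetical class and method names, not the actual SecurityUtil/DistributedFileSystem code), the fix amounts to recognizing configured logical URIs and skipping host resolution for them:

```java
import java.net.URI;
import java.util.Set;

// Hypothetical sketch, not the actual Hadoop implementation: derive a
// delegation-token service name without resolving logical HA URIs via DNS.
public class TokenServiceSketch {
    /** 'logicalNameservices' stands in for the configured dfs.nameservices IDs. */
    public static String buildTokenService(URI uri, Set<String> logicalNameservices) {
        String host = uri.getHost();
        if (logicalNameservices.contains(host)) {
            // Logical URI such as hdfs://ns1: use the nameservice ID as-is,
            // no DNS lookup (which is what threw UnknownHostException above).
            return host;
        }
        // Physical URI: keep the usual host:port form (8020 is the customary
        // NameNode RPC port, assumed here as a default for illustration).
        int port = uri.getPort() < 0 ? 8020 : uri.getPort();
        return host + ":" + port;
    }
}
```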
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230676#comment-13230676 ] Hudson commented on HDFS-3099: -- Integrated in Hadoop-Mapreduce-0.23-Commit #693 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/693/]) HDFS-3099. SecondaryNameNode does not properly initialize metrics system. Contributed by Aaron T. Myers. (Revision 1301230) Result = ABORTED atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301230 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSecondaryWebUi.java SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 0.23.3 Attachments: HDFS-3099.patch, HDFS-3099.patch, HDFS-3099.patch The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3106) TestHDFSCLI fails on Test ls: Test for /*/* globbing
TestHDFSCLI fails on Test ls: Test for /*/* globbing --- Key: HDFS-3106 URL: https://issues.apache.org/jira/browse/HDFS-3106 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.23.2 Reporter: Ravi Prakash This is the one and only test failure: 2012-03-15 18:06:42,068 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(156)) - --- 2012-03-15 18:06:42,068 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(157)) - Test ID: [30] 2012-03-15 18:06:42,068 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(158)) -Test Description: [ls: Test for /*/* globbing ] 2012-03-15 18:06:42,068 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(159)) - 2012-03-15 18:06:42,069 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(163)) - Test Commands: [-fs hdfs://localhost.localdomain:32992 -mkdir /dir0] 2012-03-15 18:06:42,069 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(163)) - Test Commands: [-fs hdfs://localhost.localdomain:32992 -mkdir /dir0/dir1] 2012-03-15 18:06:42,069 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(163)) - Test Commands: [-fs hdfs://localhost.localdomain:32992 -touchz /dir0/dir1/file1] 2012-03-15 18:06:42,069 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(163)) - Test Commands: [-fs hdfs://localhost.localdomain:32992 -ls -R /\*/\*] 2012-03-15 18:06:42,069 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(167)) - 2012-03-15 18:06:42,069 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(170)) -Cleanup Commands: [-fs hdfs://localhost.localdomain:32992 -rm -r /dir0] 2012-03-15 18:06:42,070 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(174)) - 2012-03-15 18:06:42,070 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(178)) - Comparator: [RegexpComparator] 2012-03-15 18:06:42,070 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(180)) - Comparision result: [fail] 2012-03-15 18:06:42,070 INFO cli.CLITestHelper 
(CLITestHelper.java:displayResults(182)) - Expected output: [^-rw-r--r--( )*1( )*[a-z]*( )*supergroup( )*0( )*[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( )*/dir0/dir1/file1] 2012-03-15 18:06:42,070 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(184)) - Actual output: [ls: `/\*/\*': No such file or directory]
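The actual output above shows the shell treating the backslash-escaped glob literally. As an illustration (using java.nio globs, not FsShell itself), once backslash escapes are honored, "/\*/\*" matches only a path literally named "/*/*", while the unescaped "/*/*" matches /dir0/dir1:

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Demonstrates the escaped-vs-unescaped glob distinction behind the failure.
public class GlobEscapeDemo {
    public static boolean matches(String glob, String path) {
        // java.nio glob syntax also uses backslash to escape metacharacters.
        PathMatcher m = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return m.matches(Paths.get(path));
    }
}
```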
[jira] [Updated] (HDFS-3094) add -nonInteractive and -force option to namenode -format command
[ https://issues.apache.org/jira/browse/HDFS-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-3094: -- Attachment: HDFS-3094.patch added more error checking for invalid clusterid options and added tests for them add -nonInteractive and -force option to namenode -format command - Key: HDFS-3094 URL: https://issues.apache.org/jira/browse/HDFS-3094 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.24.0, 1.0.2 Reporter: Arpit Gupta Assignee: Arpit Gupta Attachments: HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.docs.patch, HDFS-3094.patch, HDFS-3094.patch Currently the bin/hadoop namenode -format prompts the user for a Y/N to setup the directories in the local file system. -force : namenode formats the directories without prompting -nonInterActive : namenode format will return with an exit code of 1 if the dir exists. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
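The -force / -nonInteractive semantics described above can be sketched as a small decision function (hypothetical names, not the actual NameNode.java code): format silently with -force, exit with code 1 under -nonInteractive when the directories already exist, otherwise prompt.

```java
// Hypothetical sketch of the format command's prompt/exit-code policy.
public class FormatPolicySketch {
    /** Returns the action taken given the flags and whether name dirs exist. */
    public static String decide(boolean force, boolean nonInteractive, boolean dirsExist) {
        if (!dirsExist || force) {
            return "FORMAT";   // nothing to protect, or -force given: no prompt
        }
        if (nonInteractive) {
            return "EXIT_1";   // -nonInteractive: fail instead of prompting
        }
        return "PROMPT";       // default behavior: ask the user Y/N
    }
}
```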
[jira] [Commented] (HDFS-3104) Add tests for mkdir -p
[ https://issues.apache.org/jira/browse/HDFS-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230703#comment-13230703 ] Hadoop QA commented on HDFS-3104: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12518549/HDFS-3104.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 8 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.cli.TestHDFSCLI +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2021//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2021//console This message is automatically generated. Add tests for mkdir -p -- Key: HDFS-3104 URL: https://issues.apache.org/jira/browse/HDFS-3104 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.24.0, 0.23.2 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-3104.patch Add tests for HADOOP-8175. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3062) Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission.
[ https://issues.apache.org/jira/browse/HDFS-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230715#comment-13230715 ] Mingjie Lai commented on HDFS-3062: --- The test error is reported at HDFS-3106. It's not caused by the patch here. Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission. Key: HDFS-3062 URL: https://issues.apache.org/jira/browse/HDFS-3062 Project: Hadoop HDFS Issue Type: Bug Components: ha, security Affects Versions: 0.24.0 Reporter: Mingjie Lai Assignee: Mingjie Lai Priority: Critical Attachments: HDFS-3062-trunk-2.patch, HDFS-3062-trunk.patch
[jira] [Commented] (HDFS-3062) Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission.
[ https://issues.apache.org/jira/browse/HDFS-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230727#comment-13230727 ] Todd Lipcon commented on HDFS-3062: --- +1, will commit momentarily. Thanks for fixing this, Mingjie. Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission. Key: HDFS-3062 URL: https://issues.apache.org/jira/browse/HDFS-3062 Project: Hadoop HDFS Issue Type: Bug Components: ha, security Affects Versions: 0.24.0 Reporter: Mingjie Lai Assignee: Mingjie Lai Priority: Critical Attachments: HDFS-3062-trunk-2.patch, HDFS-3062-trunk.patch
[jira] [Updated] (HDFS-3062) Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission.
[ https://issues.apache.org/jira/browse/HDFS-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-3062: -- Resolution: Fixed Fix Version/s: 0.23.3 0.24.0 Target Version/s: 0.24.0, 0.23.3 (was: 0.23.3, 0.24.0) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to branch-23 and trunk, thanks Mingjie Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission. Key: HDFS-3062 URL: https://issues.apache.org/jira/browse/HDFS-3062 Project: Hadoop HDFS Issue Type: Bug Components: ha, security Affects Versions: 0.24.0 Reporter: Mingjie Lai Assignee: Mingjie Lai Priority: Critical Fix For: 0.24.0, 0.23.3 Attachments: HDFS-3062-trunk-2.patch, HDFS-3062-trunk.patch
[jira] [Commented] (HDFS-3098) Update FsShell tests for quoted metachars
[ https://issues.apache.org/jira/browse/HDFS-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230733#comment-13230733 ] Hadoop QA commented on HDFS-3098: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12518522/HDFS-3098.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 8 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2022//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2022//console This message is automatically generated. Update FsShell tests for quoted metachars - Key: HDFS-3098 URL: https://issues.apache.org/jira/browse/HDFS-3098 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.24.0, 0.23.2 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-3098.patch Need to add tests to TestDFSShell for quoted metachars. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3101: - Attachment: h3101_20120315_branch-1.patch h3101_20120315_branch-1.patch: for branch-1. cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Attachments: h3101_20120315.patch, h3101_20120315_branch-1.patch STEP: 1, create a new EMPTY file 2, read it using webhdfs. RESULT: expected: get an empty file I got: {"RemoteException":{"exception":"IOException","javaClassName":"java.io.IOException","message":"Offset=0 out of the range [0, 0); OPEN, path=/testFile"}} First of all, [0, 0) is not a valid range, and I think reading an empty file should be OK.
[jira] [Updated] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3101: - Resolution: Fixed Fix Version/s: 0.23.3 1.0.2 0.23.2 1.1.0 0.24.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks Daryn for the review. I have committed this. cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 1.1.0, 0.23.2, 1.0.2, 0.23.3 Attachments: h3101_20120315.patch, h3101_20120315_branch-1.patch
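The empty-file failure reported in HDFS-3101 is an off-by-one in an offset range check: with an exclusive upper bound, offset 0 on a zero-length file is rejected even though reading nothing is valid. A minimal sketch of the check involved (hypothetical method names, not the actual NamenodeWebHdfsMethods patch):

```java
// Sketch of the range-check bug behind "Offset=0 out of the range [0, 0)".
public class OffsetCheckSketch {
    // Buggy form: strict offset < length rejects offset 0 on an empty file.
    public static boolean validBuggy(long offset, long length) {
        return offset >= 0 && offset < length;
    }

    // Fixed form: additionally permit offset 0 when the file is empty.
    public static boolean validFixed(long offset, long length) {
        return offset >= 0 && (offset < length || (offset == 0 && length == 0));
    }
}
```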
[jira] [Commented] (HDFS-3062) Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission.
[ https://issues.apache.org/jira/browse/HDFS-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230749#comment-13230749 ] Hudson commented on HDFS-3062: -- Integrated in Hadoop-Common-0.23-Commit #687 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/687/]) HDFS-3062. Fix bug which prevented MR job submission from creating delegation tokens on an HA cluster. Contributed by Mingjie Lai. (Revision 1301286) Result = SUCCESS todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1301286 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDelegationTokensWithHA.java Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission. Key: HDFS-3062 URL: https://issues.apache.org/jira/browse/HDFS-3062 Project: Hadoop HDFS Issue Type: Bug Components: ha, security Affects Versions: 0.24.0 Reporter: Mingjie Lai Assignee: Mingjie Lai Priority: Critical Fix For: 0.24.0, 0.23.3 Attachments: HDFS-3062-trunk-2.patch, HDFS-3062-trunk.patch
[jira] [Commented] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230750#comment-13230750 ] Hudson commented on HDFS-3101: -- Integrated in Hadoop-Common-0.23-Commit #687 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/687/]) svn merge -c 1301287 from trunk for HDFS-3101. (Revision 1301288) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1301288 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsFileSystemContract.java cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 1.1.0, 0.23.2, 1.0.2, 0.23.3 Attachments: h3101_20120315.patch, h3101_20120315_branch-1.patch
[jira] [Commented] (HDFS-3098) Update FsShell tests for quoted metachars
[ https://issues.apache.org/jira/browse/HDFS-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230751#comment-13230751 ] Tsz Wo (Nicholas), SZE commented on HDFS-3098: -- +1 That's great. The build is back to stable. Update FsShell tests for quoted metachars - Key: HDFS-3098 URL: https://issues.apache.org/jira/browse/HDFS-3098 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.24.0, 0.23.2 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-3098.patch
[jira] [Commented] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230758#comment-13230758 ] Hudson commented on HDFS-3101: -- Integrated in Hadoop-Common-trunk-Commit #1883 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1883/]) HDFS-3101. Cannot read empty file using WebHDFS. (Revision 1301287) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1301287 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsFileSystemContract.java cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 1.1.0, 0.23.2, 1.0.2, 0.23.3 Attachments: h3101_20120315.patch, h3101_20120315_branch-1.patch