[jira] [Commented] (HDFS-1957) Documentation for HFTP
[ https://issues.apache.org/jira/browse/HDFS-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036000#comment-13036000 ]

Hadoop QA commented on HDFS-1957:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12479719/HDFS-1957.patch
against trunk revision 1124459.

+1 @author. The patch does not contain any @author tags.
+0 tests included. The patch appears to be a documentation patch that doesn't require tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.tools.TestJMXGet
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/580//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/580//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/580//console

This message is automatically generated.

Documentation for HFTP
----------------------

Key: HDFS-1957
URL: https://issues.apache.org/jira/browse/HDFS-1957
Project: Hadoop HDFS
Issue Type: Improvement
Components: documentation
Affects Versions: 0.23.0
Reporter: Ari Rabkin
Assignee: Ari Rabkin
Priority: Minor
Fix For: 0.23.0
Attachments: HDFS-1957.patch, HDFS-1957.patch, HDFS-1957.patch

There should be some documentation for HFTP.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
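Since HFTP is exposed through the standard FileSystem API, documentation examples for it largely reduce to ordinary FileSystem calls with an hftp:// URI. A minimal sketch of reading a file this way — the host name and file path below are hypothetical, and 50070 is only the customary NameNode HTTP port:

{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HftpCat {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // HFTP is read-only; writes still have to go through hdfs://.
    FileSystem fs =
        FileSystem.get(URI.create("hftp://namenode.example.com:50070/"), conf);
    FSDataInputStream stream = fs.open(new Path("/user/example/file.txt"));
    BufferedReader in = new BufferedReader(new InputStreamReader(stream));
    String line;
    while ((line = in.readLine()) != null) {
      System.out.println(line);
    }
    in.close();
  }
}
{code}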
[jira] [Updated] (HDFS-1961) New architectural documentation created
[ https://issues.apache.org/jira/browse/HDFS-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Kazman updated HDFS-1961:
------------------------------

Attachment: HDFS ArchDoc.Jira.docx

This is a Word version of the architecture documentation. The HTML version can be found at:
http://kazman.shidler.hawaii.edu/ArchDoc.html

New architectural documentation created
----------------------------------------

Key: HDFS-1961
URL: https://issues.apache.org/jira/browse/HDFS-1961
Project: Hadoop HDFS
Issue Type: Improvement
Components: documentation
Affects Versions: 0.21.0
Reporter: Rick Kazman
Labels: architecture, hadoop, newbie
Fix For: 0.21.0
Attachments: HDFS ArchDoc.Jira.docx

This material provides an overview of the HDFS architecture and is intended for contributors. The goal of this document is to provide a guide to the overall structure of the HDFS code so that contributors can more effectively understand how changes that they are considering can be made, and the consequences of those changes. The assumption is that the reader has a basic understanding of HDFS, its purpose, and how it fits into the Hadoop project suite.

An HTML version of the architectural documentation can be found at:
http://kazman.shidler.hawaii.edu/ArchDoc.html

All comments and suggestions for improvements are appreciated.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1961) New architectural documentation created
[ https://issues.apache.org/jira/browse/HDFS-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036003#comment-13036003 ]

Rick Kazman commented on HDFS-1961:
-----------------------------------

We expect to be making periodic updates to this document. Our first update task is to add sequence diagrams to section 6.

New architectural documentation created
----------------------------------------

Key: HDFS-1961
URL: https://issues.apache.org/jira/browse/HDFS-1961
Project: Hadoop HDFS
Issue Type: Improvement
Components: documentation
Affects Versions: 0.21.0
Reporter: Rick Kazman
Labels: architecture, hadoop, newbie
Fix For: 0.21.0
Attachments: HDFS ArchDoc.Jira.docx

This material provides an overview of the HDFS architecture and is intended for contributors. The goal of this document is to provide a guide to the overall structure of the HDFS code so that contributors can more effectively understand how changes that they are considering can be made, and the consequences of those changes. The assumption is that the reader has a basic understanding of HDFS, its purpose, and how it fits into the Hadoop project suite.

An HTML version of the architectural documentation can be found at:
http://kazman.shidler.hawaii.edu/ArchDoc.html

All comments and suggestions for improvements are appreciated.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1575) viewing block from web UI broken
[ https://issues.apache.org/jira/browse/HDFS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-1575:
------------------------------

Attachment: hdfs-1575-trunk.3.patch

Hi Aaron. I noticed some opportunity for cleanup here:
- removed unused import of org.mortbay.Log
- cleaned up the JSP so there isn't duplicated code
- the case that blks == null or blks.size() == 0 and security was off was being handled incorrectly
- fixed a line that was super long

Can you take a look at my changes?

viewing block from web UI broken
--------------------------------

Key: HDFS-1575
URL: https://issues.apache.org/jira/browse/HDFS-1575
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Aaron T. Myers
Priority: Blocker
Fix For: 0.22.0
Attachments: HDFS-1575, hdfs-1575-trunk.0.patch, hdfs-1575-trunk.1.patch, hdfs-1575-trunk.2.patch, hdfs-1575-trunk.3.patch

DatanodeJspHelper seems to expect the file path to be in the path info of the HttpRequest, rather than in a parameter. I see the following exception when visiting the URL
{{http://localhost.localdomain:50075/browseBlock.jsp?blockId=5006108823351810567&blockSize=20&genstamp=1001&filename=%2Fuser%2Ftodd%2Fissue&datanodePort=50010&namenodeInfoPort=50070}}

java.io.FileNotFoundException: File does not exist: /
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInternal(FSNamesystem.java:834)
    ...
    at org.apache.hadoop.hdfs.server.datanode.DatanodeJspHelper.generateFileDetails(DatanodeJspHelper.java:258)
    at org.apache.hadoop.hdfs.server.datanode.browseBlock_jsp._jspService(browseBlock_jsp.java:79)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
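A sketch of the null/empty handling the cleanup above describes, under the assumption that the JSP helper renders one row per located block; the class and method names are illustrative, not the committed code:

{code}
import java.io.PrintWriter;
import java.util.List;

import org.apache.hadoop.hdfs.protocol.LocatedBlock;

public class BlockListRenderer {
  // Treat a null or empty block list as "no blocks" instead of iterating blindly.
  static void renderBlocks(List<LocatedBlock> blks, String filename, PrintWriter out) {
    if (blks == null || blks.isEmpty()) {
      out.println("No datanodes contain blocks of file " + filename);
      return;
    }
    for (LocatedBlock blk : blks) {
      out.println("Block: " + blk.getBlock());  // one row per block
    }
  }
}
{code}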
[jira] [Updated] (HDFS-1905) Improve the usability of namenode -format
[ https://issues.apache.org/jira/browse/HDFS-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated HDFS-1905:
----------------------------------

Hadoop Flags: [Reviewed]
Status: Patch Available (was: Open)

Improve the usability of namenode -format
-----------------------------------------

Key: HDFS-1905
URL: https://issues.apache.org/jira/browse/HDFS-1905
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
Fix For: 0.23.0
Attachments: HDFS-1905-1.patch, HDFS-1905-2.patch

While setting up a 0.23-based cluster, I ran into this issue. When I issue a format namenode command, which got changed in 23, it should let the user know how to use this command in cases where the complete options were not specified.

./hdfs namenode -format

I get the following error msg, but it's still not clear how the user should use this command.

11/05/09 15:36:25 ERROR namenode.NameNode: java.lang.IllegalArgumentException: Format must be provided with clusterid
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1483)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1623)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1689)

The usability of this command can be improved.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
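For illustration, a hedged sketch of the kind of usability fix being reviewed here — print a usage line before failing when the -format arguments are incomplete. The usage string and method shape are assumptions, not the contents of HDFS-1905-2.patch:

{code}
public class FormatUsage {
  // Illustrative usage text; the real command accepts more options.
  private static final String USAGE =
      "Usage: hdfs namenode -format [-clusterid <cid>]";

  // Returns the cluster ID following -clusterid, or fails with a usage hint.
  static String parseClusterId(String[] args, int i) {
    if (i + 1 >= args.length || args[i + 1].startsWith("-")) {
      System.err.println(USAGE);  // show how to call the command before failing
      throw new IllegalArgumentException(
          "Must specify a valid cluster ID after -clusterid");
    }
    return args[i + 1];
  }

  public static void main(String[] args) {
    System.out.println(parseClusterId(new String[] {"-clusterid", "c1"}, 0));
  }
}
{code}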
[jira] [Commented] (HDFS-1575) viewing block from web UI broken
[ https://issues.apache.org/jira/browse/HDFS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036014#comment-13036014 ]

Aaron T. Myers commented on HDFS-1575:
--------------------------------------

+1, looks good to me. I especially like the reworking of the loop which iterates over {{blks}}. Thanks for cleaning this up.

viewing block from web UI broken
--------------------------------

Key: HDFS-1575
URL: https://issues.apache.org/jira/browse/HDFS-1575
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Aaron T. Myers
Priority: Blocker
Fix For: 0.22.0
Attachments: HDFS-1575, hdfs-1575-trunk.0.patch, hdfs-1575-trunk.1.patch, hdfs-1575-trunk.2.patch, hdfs-1575-trunk.3.patch

DatanodeJspHelper seems to expect the file path to be in the path info of the HttpRequest, rather than in a parameter. I see the following exception when visiting the URL
{{http://localhost.localdomain:50075/browseBlock.jsp?blockId=5006108823351810567&blockSize=20&genstamp=1001&filename=%2Fuser%2Ftodd%2Fissue&datanodePort=50010&namenodeInfoPort=50070}}

java.io.FileNotFoundException: File does not exist: /
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInternal(FSNamesystem.java:834)
    ...
    at org.apache.hadoop.hdfs.server.datanode.DatanodeJspHelper.generateFileDetails(DatanodeJspHelper.java:258)
    at org.apache.hadoop.hdfs.server.datanode.browseBlock_jsp._jspService(browseBlock_jsp.java:79)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1922) Recurring failure in TestJMXGet.testNameNode since build 477 on May 11
[ https://issues.apache.org/jira/browse/HDFS-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036016#comment-13036016 ]

Todd Lipcon commented on HDFS-1922:
-----------------------------------

It seems like it would be straight-forward to have a missing .properties file act like the default one that we check into conf/ (ie FileSink). That would make it less of an incompatible change, right?

Recurring failure in TestJMXGet.testNameNode since build 477 on May 11
----------------------------------------------------------------------

Key: HDFS-1922
URL: https://issues.apache.org/jira/browse/HDFS-1922
Project: Hadoop HDFS
Issue Type: Sub-task
Components: test
Reporter: Matt Foley
Assignee: Luke Lu
Fix For: 0.23.0
Attachments: hdfs-1922-conf-v1.patch

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
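A sketch of the fallback Todd suggests, assuming the metrics system loads its configuration from a classpath resource. The property key mirrors the example file shipped in conf/, but treat the details as assumptions rather than the eventual HADOOP-7306 change:

{code}
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class MetricsConfigLoader {
  // Load the named classpath resource, or fall back to defaults that mimic
  // the example file shipped in conf/ (a FileSink), instead of failing.
  static Properties loadOrDefault(String resource) {
    Properties props = new Properties();
    InputStream in = MetricsConfigLoader.class.getClassLoader()
        .getResourceAsStream(resource);
    if (in == null) {
      // Missing config behaves like the default conf/ file.
      props.setProperty("*.sink.file.class",
          "org.apache.hadoop.metrics2.sink.FileSink");
      return props;
    }
    try {
      props.load(in);
      in.close();
    } catch (IOException e) {
      throw new RuntimeException("Failed to read " + resource, e);
    }
    return props;
  }
}
{code}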
[jira] [Updated] (HDFS-1013) Miscellaneous improvements to HTML markup for web UIs
[ https://issues.apache.org/jira/browse/HDFS-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-1013:
------------------------------

Status: Open (was: Patch Available)

Miscellaneous improvements to HTML markup for web UIs
-----------------------------------------------------

Key: HDFS-1013
URL: https://issues.apache.org/jira/browse/HDFS-1013
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Todd Lipcon
Assignee: Eugene Koontz
Priority: Minor
Labels: newbie
Fix For: 0.20.3
Attachments: HDFS-1013.patch

The Web UIs have various bits of bad markup (eg missing head sections, some pages missing CSS links, inconsistent td vs th for table headings). We should fix this up.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1592) Datanode startup doesn't honor volumes.tolerated
[ https://issues.apache.org/jira/browse/HDFS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036024#comment-13036024 ]

Hadoop QA commented on HDFS-1592:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12479721/HDFS-1592-3.patch
against trunk revision 1124459.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 5 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSRemove
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.tools.TestJMXGet
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/581//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/581//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/581//console

This message is automatically generated.

Datanode startup doesn't honor volumes.tolerated
------------------------------------------------

Key: HDFS-1592
URL: https://issues.apache.org/jira/browse/HDFS-1592
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.20.204.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Fix For: 0.20.204.0, 0.23.0
Attachments: HDFS-1592-1.patch, HDFS-1592-2.patch, HDFS-1592-3.patch, HDFS-1592-rel20.patch

Datanode startup doesn't honor volumes.tolerated for hadoop 20 version.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
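As an illustration of what "honoring volumes tolerated" at startup means, a hedged sketch: the DataNode should come up as long as the number of unusable data directories does not exceed the configured tolerance. This mirrors the described intent, not the patch itself, and the class is hypothetical:

{code}
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class VolumeCheck {
  // Returns the usable data directories, failing startup only when more
  // volumes have failed than dfs.datanode.failed.volumes.tolerated allows.
  static List<File> usableVolumes(File[] dataDirs, int tolerated) {
    List<File> good = new ArrayList<File>();
    int failed = 0;
    for (File dir : dataDirs) {
      if (dir.isDirectory() && dir.canRead() && dir.canWrite()) {
        good.add(dir);
      } else {
        failed++;
      }
    }
    if (failed > tolerated) {
      throw new IllegalStateException(
          failed + " volumes failed, exceeds tolerated " + tolerated);
    }
    return good;
  }
}
{code}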
[jira] [Commented] (HDFS-1575) viewing block from web UI broken
[ https://issues.apache.org/jira/browse/HDFS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036026#comment-13036026 ]

Hadoop QA commented on HDFS-1575:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12479722/hdfs-1575-trunk.2.patch
against trunk revision 1124459.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 5 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.tools.TestJMXGet
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/582//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/582//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/582//console

This message is automatically generated.

viewing block from web UI broken
--------------------------------

Key: HDFS-1575
URL: https://issues.apache.org/jira/browse/HDFS-1575
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Aaron T. Myers
Priority: Blocker
Fix For: 0.22.0
Attachments: HDFS-1575, hdfs-1575-trunk.0.patch, hdfs-1575-trunk.1.patch, hdfs-1575-trunk.2.patch, hdfs-1575-trunk.3.patch

DatanodeJspHelper seems to expect the file path to be in the path info of the HttpRequest, rather than in a parameter. I see the following exception when visiting the URL
{{http://localhost.localdomain:50075/browseBlock.jsp?blockId=5006108823351810567&blockSize=20&genstamp=1001&filename=%2Fuser%2Ftodd%2Fissue&datanodePort=50010&namenodeInfoPort=50070}}

java.io.FileNotFoundException: File does not exist: /
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInternal(FSNamesystem.java:834)
    ...
    at org.apache.hadoop.hdfs.server.datanode.DatanodeJspHelper.generateFileDetails(DatanodeJspHelper.java:258)
    at org.apache.hadoop.hdfs.server.datanode.browseBlock_jsp._jspService(browseBlock_jsp.java:79)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036033#comment-13036033 ]

Eli Collins commented on HDFS-1958:
-----------------------------------

+1 lgtm

Format confirmation prompt should be more lenient of its input
---------------------------------------------------------------

Key: HDFS-1958
URL: https://issues.apache.org/jira/browse/HDFS-1958
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Fix For: 0.22.0
Attachments: hdfs-1958.txt

As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
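The leniency under review is small enough to sketch directly — accept 'y' or 'yes' in any case, exactly as the issue description asks; the helper name is hypothetical rather than taken from hdfs-1958.txt:

{code}
public class ConfirmPrompt {
  // True iff the input confirms the prompt: "y" or "yes", case-insensitive.
  static boolean confirmed(String input) {
    if (input == null) {
      return false;
    }
    String s = input.trim().toLowerCase();
    return s.equals("y") || s.equals("yes");
  }

  public static void main(String[] args) {
    System.out.println(confirmed("YES"));  // true
    System.out.println(confirmed("n"));    // false
  }
}
{code}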
[jira] [Commented] (HDFS-1575) viewing block from web UI broken
[ https://issues.apache.org/jira/browse/HDFS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036037#comment-13036037 ]

Hadoop QA commented on HDFS-1575:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12479729/hdfs-1575-trunk.3.patch
against trunk revision 1124459.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 6 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.hdfs.TestHDFSTrash
  org.apache.hadoop.hdfs.TestPipelines
  org.apache.hadoop.tools.TestJMXGet
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/585//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/585//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/585//console

This message is automatically generated.

viewing block from web UI broken
--------------------------------

Key: HDFS-1575
URL: https://issues.apache.org/jira/browse/HDFS-1575
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Aaron T. Myers
Priority: Blocker
Fix For: 0.22.0
Attachments: HDFS-1575, hdfs-1575-trunk.0.patch, hdfs-1575-trunk.1.patch, hdfs-1575-trunk.2.patch, hdfs-1575-trunk.3.patch

DatanodeJspHelper seems to expect the file path to be in the path info of the HttpRequest, rather than in a parameter. I see the following exception when visiting the URL
{{http://localhost.localdomain:50075/browseBlock.jsp?blockId=5006108823351810567&blockSize=20&genstamp=1001&filename=%2Fuser%2Ftodd%2Fissue&datanodePort=50010&namenodeInfoPort=50070}}

java.io.FileNotFoundException: File does not exist: /
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInternal(FSNamesystem.java:834)
    ...
    at org.apache.hadoop.hdfs.server.datanode.DatanodeJspHelper.generateFileDetails(DatanodeJspHelper.java:258)
    at org.apache.hadoop.hdfs.server.datanode.browseBlock_jsp._jspService(browseBlock_jsp.java:79)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-1958:
------------------------------

Resolution: Fixed
Fix Version/s: (was: 0.22.0)
               0.23.0
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)

Thanks for review, Eli. I elected to only commit this to trunk since it's a new feature/improvement.

Format confirmation prompt should be more lenient of its input
---------------------------------------------------------------

Key: HDFS-1958
URL: https://issues.apache.org/jira/browse/HDFS-1958
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Fix For: 0.23.0
Attachments: hdfs-1958.txt

As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1905) Improve the usability of namenode -format
[ https://issues.apache.org/jira/browse/HDFS-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036043#comment-13036043 ]

Hadoop QA commented on HDFS-1905:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12479685/HDFS-1905-2.patch
against trunk revision 1124459.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 4 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.tools.TestJMXGet
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/583//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/583//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/583//console

This message is automatically generated.

Improve the usability of namenode -format
-----------------------------------------

Key: HDFS-1905
URL: https://issues.apache.org/jira/browse/HDFS-1905
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
Fix For: 0.23.0
Attachments: HDFS-1905-1.patch, HDFS-1905-2.patch

While setting up a 0.23-based cluster, I ran into this issue. When I issue a format namenode command, which got changed in 23, it should let the user know how to use this command in cases where the complete options were not specified.

./hdfs namenode -format

I get the following error msg, but it's still not clear how the user should use this command.

11/05/09 15:36:25 ERROR namenode.NameNode: java.lang.IllegalArgumentException: Format must be provided with clusterid
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1483)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1623)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1689)

The usability of this command can be improved.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1575) viewing block from web UI broken
[ https://issues.apache.org/jira/browse/HDFS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036045#comment-13036045 ]

Todd Lipcon commented on HDFS-1575:
-----------------------------------

I tried the failing tests locally and they pass. Will commit to 22 and trunk momentarily.

viewing block from web UI broken
--------------------------------

Key: HDFS-1575
URL: https://issues.apache.org/jira/browse/HDFS-1575
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Aaron T. Myers
Priority: Blocker
Fix For: 0.22.0
Attachments: HDFS-1575, hdfs-1575-trunk.0.patch, hdfs-1575-trunk.1.patch, hdfs-1575-trunk.2.patch, hdfs-1575-trunk.3.patch

DatanodeJspHelper seems to expect the file path to be in the path info of the HttpRequest, rather than in a parameter. I see the following exception when visiting the URL
{{http://localhost.localdomain:50075/browseBlock.jsp?blockId=5006108823351810567&blockSize=20&genstamp=1001&filename=%2Fuser%2Ftodd%2Fissue&datanodePort=50010&namenodeInfoPort=50070}}

java.io.FileNotFoundException: File does not exist: /
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInternal(FSNamesystem.java:834)
    ...
    at org.apache.hadoop.hdfs.server.datanode.DatanodeJspHelper.generateFileDetails(DatanodeJspHelper.java:258)
    at org.apache.hadoop.hdfs.server.datanode.browseBlock_jsp._jspService(browseBlock_jsp.java:79)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1575) viewing block from web UI broken
[ https://issues.apache.org/jira/browse/HDFS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036049#comment-13036049 ]

Todd Lipcon commented on HDFS-1575:
-----------------------------------

Committed to trunk. Looks like we need to alter the patch a little for 0.22 since the federation stuff isn't there. Mind doing that?

viewing block from web UI broken
--------------------------------

Key: HDFS-1575
URL: https://issues.apache.org/jira/browse/HDFS-1575
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Aaron T. Myers
Priority: Blocker
Fix For: 0.22.0
Attachments: HDFS-1575, hdfs-1575-trunk.0.patch, hdfs-1575-trunk.1.patch, hdfs-1575-trunk.2.patch, hdfs-1575-trunk.3.patch

DatanodeJspHelper seems to expect the file path to be in the path info of the HttpRequest, rather than in a parameter. I see the following exception when visiting the URL
{{http://localhost.localdomain:50075/browseBlock.jsp?blockId=5006108823351810567&blockSize=20&genstamp=1001&filename=%2Fuser%2Ftodd%2Fissue&datanodePort=50010&namenodeInfoPort=50070}}

java.io.FileNotFoundException: File does not exist: /
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInternal(FSNamesystem.java:834)
    ...
    at org.apache.hadoop.hdfs.server.datanode.DatanodeJspHelper.generateFileDetails(DatanodeJspHelper.java:258)
    at org.apache.hadoop.hdfs.server.datanode.browseBlock_jsp._jspService(browseBlock_jsp.java:79)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1953) Change name node mxbean name in cluster web console
[ https://issues.apache.org/jira/browse/HDFS-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036048#comment-13036048 ]

Suresh Srinivas commented on HDFS-1953:
---------------------------------------

+1 for the patch. This is a simple change in the name of the mxbean. I am not planning to run hudson validation.

Change name node mxbean name in cluster web console
---------------------------------------------------

Key: HDFS-1953
URL: https://issues.apache.org/jira/browse/HDFS-1953
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Tanping Wang
Assignee: Tanping Wang
Priority: Minor
Fix For: 0.23.0
Attachments: HDFS-1953-1.patch

The name node mxbean name changed after the new metrics framework was checked in. Need to change this in ClusterJspHelper.java in order for the cluster web console to work again.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
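For readers following along, a hedged sketch of looking up the NameNode MXBean by name over JMX. The ObjectName below follows the metrics2-era convention, but since this jira exists precisely because the name changed, treat it (and the "Version" attribute) as assumptions:

{code}
import java.lang.management.ManagementFactory;

import javax.management.MBeanServer;
import javax.management.ObjectName;

public class NameNodeMXBeanLookup {
  public static void main(String[] args) throws Exception {
    MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
    // Assumed metrics2-era name; the v1 name differed, which broke the console.
    ObjectName name = new ObjectName("Hadoop:service=NameNode,name=NameNodeInfo");
    Object version = mbs.getAttribute(name, "Version");
    System.out.println("NameNode version: " + version);
  }
}
{code}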
[jira] [Resolved] (HDFS-1953) Change name node mxbean name in cluster web console
[ https://issues.apache.org/jira/browse/HDFS-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas resolved HDFS-1953.
-----------------------------------

Resolution: Fixed
Hadoop Flags: [Reviewed]

I committed the patch. Thank you Tanping.

Change name node mxbean name in cluster web console
---------------------------------------------------

Key: HDFS-1953
URL: https://issues.apache.org/jira/browse/HDFS-1953
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Tanping Wang
Assignee: Tanping Wang
Priority: Minor
Fix For: 0.23.0
Attachments: HDFS-1953-1.patch

The name node mxbean name changed after the new metrics framework was checked in. Need to change this in ClusterJspHelper.java in order for the cluster web console to work again.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1905) Improve the usability of namenode -format
[ https://issues.apache.org/jira/browse/HDFS-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036052#comment-13036052 ]

Hadoop QA commented on HDFS-1905:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12479685/HDFS-1905-2.patch
against trunk revision 1124459.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 4 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.hdfs.TestGetBlocks
  org.apache.hadoop.hdfs.TestHDFSTrash
  org.apache.hadoop.tools.TestJMXGet
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/584//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/584//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/584//console

This message is automatically generated.

Improve the usability of namenode -format
-----------------------------------------

Key: HDFS-1905
URL: https://issues.apache.org/jira/browse/HDFS-1905
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
Fix For: 0.23.0
Attachments: HDFS-1905-1.patch, HDFS-1905-2.patch

While setting up a 0.23-based cluster, I ran into this issue. When I issue a format namenode command, which got changed in 23, it should let the user know how to use this command in cases where the complete options were not specified.

./hdfs namenode -format

I get the following error msg, but it's still not clear how the user should use this command.

11/05/09 15:36:25 ERROR namenode.NameNode: java.lang.IllegalArgumentException: Format must be provided with clusterid
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1483)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1623)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1689)

The usability of this command can be improved.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1875) MiniDFSCluster hard-codes dfs.datanode.address to localhost
[ https://issues.apache.org/jira/browse/HDFS-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036253#comment-13036253 ]

Eric Payne commented on HDFS-1875:
----------------------------------

Test failures are not related to this patch. They were failing in several of the previous builds as well. See Build #556, for example.

MiniDFSCluster hard-codes dfs.datanode.address to localhost
-----------------------------------------------------------

Key: HDFS-1875
URL: https://issues.apache.org/jira/browse/HDFS-1875
Project: Hadoop HDFS
Issue Type: Bug
Components: test
Affects Versions: 0.22.0
Reporter: Eric Payne
Assignee: Eric Payne
Fix For: 0.23.0
Attachments: HDFS-1875.patch

When creating RPC addresses that represent the communication sockets for each simulated DataNode, the MiniDFSCluster class hard-codes the address of the dfs.datanode.address port to be 127.0.0.1:0

The DataNodeCluster test tool uses the MiniDFSCluster class to create a selected number of simulated datanodes on a single host. In the DataNodeCluster setup, the NameNode is not simulated but is started as a separate daemon.

The problem is that if the write requests into the simulated datanodes originate on a host that is not the same host running the simulated datanodes, the connections are refused. This is because the RPC sockets that are started by MiniDFSCluster are for localhost (127.0.0.1) and are not accessible from outside that same machine.

It is proposed that the MiniDFSCluster.setupDatanodeAddress() method be overloaded in order to accommodate an environment where the NameNode is on one host, the client is on another host, and the simulated DataNodes are on yet another host (or even multiple hosts simulating multiple DataNodes each). The overloaded API would add a parameter that would be used as the basis for creating the RPC sockets. By default, it would remain 127.0.0.1

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
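A sketch of the proposed overload, using the real dfs.datanode.* configuration keys but an assumed method shape; the no-argument form keeps the current 127.0.0.1 behavior, as the description requires:

{code}
import org.apache.hadoop.conf.Configuration;

public class DatanodeAddressSetup {
  // Default keeps the existing behavior: bind everything to localhost.
  static void setupDatanodeAddress(Configuration conf) {
    setupDatanodeAddress(conf, "127.0.0.1");
  }

  // Proposed overload: callers supply the bind address used for the sockets.
  static void setupDatanodeAddress(Configuration conf, String bindAddress) {
    // Port 0 asks the OS for any free port, as MiniDFSCluster already does.
    conf.set("dfs.datanode.address", bindAddress + ":0");
    conf.set("dfs.datanode.http.address", bindAddress + ":0");
    conf.set("dfs.datanode.ipc.address", bindAddress + ":0");
  }
}
{code}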
[jira] [Created] (HDFS-1962) Enhance MiniDFSCluster to improve testing of network topology distance related issues.
Enhance MiniDFSCluster to improve testing of network topology distance related issues.
---------------------------------------------------------------------------------------

Key: HDFS-1962
URL: https://issues.apache.org/jira/browse/HDFS-1962
Project: Hadoop HDFS
Issue Type: Improvement
Components: test
Affects Versions: 0.22.0
Reporter: Eric Payne
Fix For: 0.23.0

In Jira HDFS-1875, Tanping Wang added the following comment. In order to keep the scope of HDFS-1875 small, I have created this Jira to capture this need.

{quote}
It would be really useful if we could have multiple simulated data nodes bound to different hosts and the dfs client bound to a particular host. And further down the road, some of the simulated data nodes could be on different hosts but the same rack. We can use this to test network topology distance related issues.

One related problem that I ran into was that the order of data nodes in the LocatedBlock returned by the name node is sorted by NetworkTopology#pseudoSortByDistance(). In the current mini dfs cluster, there is no way I can bind the client to a host or bind a simulated data node to a particular host/rack. It would be nice if the mini dfs cluster made this possible, so that the network topology distance of the client to each data node is fixed and, therefore, the order of data nodes returned within a LocatedBlock on the mini dfs cluster is fixed. Currently the order of data nodes in a LocatedBlock is randomly sorted, which means NetworkTopology does not treat the DFSClient and the simulated datanodes as being on different hosts and different racks. Also, the mini dfs cluster currently provides the -racks option when starting data nodes, but we cannot bind multiple simulated data nodes to one rack... so it is not really that useful.
{quote}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
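For context, the rack side of this is already partly expressible through the long-standing MiniDFSCluster constructor that takes per-datanode rack strings; what this jira asks for beyond that (binding the client and datanodes to distinct hosts) is deliberately not shown, since it does not exist yet. A minimal, hedged sketch:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class RackedMiniCluster {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Two datanodes share /rack1, one is on /rack2 -- but all of them still
    // bind to 127.0.0.1, which is exactly the limitation this jira targets.
    String[] racks = {"/rack1", "/rack1", "/rack2"};
    MiniDFSCluster cluster = new MiniDFSCluster(conf, 3, true, racks);
    try {
      cluster.waitActive();
    } finally {
      cluster.shutdown();
    }
  }
}
{code}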
[jira] [Commented] (HDFS-1875) MiniDFSCluster hard-codes dfs.datanode.address to localhost
[ https://issues.apache.org/jira/browse/HDFS-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036257#comment-13036257 ]

Eric Payne commented on HDFS-1875:
----------------------------------

In order to keep the scope of this Jira small, I have opened HDFS-1962 to cover Tanping's topology enhancement idea.

MiniDFSCluster hard-codes dfs.datanode.address to localhost
-----------------------------------------------------------

Key: HDFS-1875
URL: https://issues.apache.org/jira/browse/HDFS-1875
Project: Hadoop HDFS
Issue Type: Bug
Components: test
Affects Versions: 0.22.0
Reporter: Eric Payne
Assignee: Eric Payne
Fix For: 0.23.0
Attachments: HDFS-1875.patch

When creating RPC addresses that represent the communication sockets for each simulated DataNode, the MiniDFSCluster class hard-codes the address of the dfs.datanode.address port to be 127.0.0.1:0

The DataNodeCluster test tool uses the MiniDFSCluster class to create a selected number of simulated datanodes on a single host. In the DataNodeCluster setup, the NameNode is not simulated but is started as a separate daemon.

The problem is that if the write requests into the simulated datanodes originate on a host that is not the same host running the simulated datanodes, the connections are refused. This is because the RPC sockets that are started by MiniDFSCluster are for localhost (127.0.0.1) and are not accessible from outside that same machine.

It is proposed that the MiniDFSCluster.setupDatanodeAddress() method be overloaded in order to accommodate an environment where the NameNode is on one host, the client is on another host, and the simulated DataNodes are on yet another host (or even multiple hosts simulating multiple DataNodes each). The overloaded API would add a parameter that would be used as the basis for creating the RPC sockets. By default, it would remain 127.0.0.1

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1869) mkdirs should use the supplied permission for all of the created directories
[ https://issues.apache.org/jira/browse/HDFS-1869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036264#comment-13036264 ]

Sanjay Radia commented on HDFS-1869:
------------------------------------

Daryn, have you determined if the semantics of mkdirs changed at some point, or if this bug has always existed?

mkdirs should use the supplied permission for all of the created directories
-----------------------------------------------------------------------------

Key: HDFS-1869
URL: https://issues.apache.org/jira/browse/HDFS-1869
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Attachments: HDFS-1869-2.patch, HDFS-1869.patch

Mkdirs only uses the supplied FsPermission for the last directory of the path. Paths 0..N-1 will all inherit the parent dir's permissions -even if- inheritPermission is false. This is a regression from somewhere around 0.20.9 and does not follow posix semantics.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1568) Improve DataXceiver error logging
[ https://issues.apache.org/jira/browse/HDFS-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joey Echeverria updated HDFS-1568:
----------------------------------

Attachment: HDFS-1568-output-changes.patch

Here's a new patch that only includes the changes that affect output in the logs. The rest of the changes in the original patch do one of two things:
1) Re-format the code to be more consistent.
2) Replace calls to s.getRemoteSocketAddress() in log statements with references to remoteAddress, which is set in the constructor.

Improve DataXceiver error logging
---------------------------------

Key: HDFS-1568
URL: https://issues.apache.org/jira/browse/HDFS-1568
Project: Hadoop HDFS
Issue Type: Improvement
Components: data-node
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Joey Echeverria
Priority: Minor
Labels: newbie
Attachments: HDFS-1568-1.patch, HDFS-1568-output-changes.patch

In supporting customers we often see things like SocketTimeoutExceptions or EOFExceptions coming from DataXceiver, but the logging isn't very good. For example, if we get an IOE while setting up a connection to the downstream mirror in writeBlock, the IP of the downstream mirror isn't logged on the DN side.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
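A sketch of the refactor described in item 2: capture the remote address once in the constructor and reuse it at every log site, so log output stays consistent even after the socket is closed. The class shape and log text are illustrative, not the patch itself:

{code}
import java.net.Socket;

class XceiverLogHelper {
  private final String remoteAddress;  // captured once in the constructor

  XceiverLogHelper(Socket s) {
    // Cache the address instead of calling s.getRemoteSocketAddress()
    // at each log site.
    this.remoteAddress = String.valueOf(s.getRemoteSocketAddress());
  }

  void logOpFailure(String op, Exception e) {
    System.err.println(op + " received exception from " + remoteAddress
        + ": " + e);
  }
}
{code}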
[jira] [Commented] (HDFS-1869) mkdirs should use the supplied permission for all of the created directories
[ https://issues.apache.org/jira/browse/HDFS-1869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036274#comment-13036274 ]

Daryn Sharp commented on HDFS-1869:
-----------------------------------

Yes, mkdirs used to be posix compliant, but was subsequently broken. This is directly related to the linked HADOOP bug that mentioned the problem being introduced sometime after 0.20.9. The broken behavior was introduced when another feature was added (my memory is fuzzy, I think it was quotas).

mkdirs should use the supplied permission for all of the created directories
-----------------------------------------------------------------------------

Key: HDFS-1869
URL: https://issues.apache.org/jira/browse/HDFS-1869
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Attachments: HDFS-1869-2.patch, HDFS-1869.patch

Mkdirs only uses the supplied FsPermission for the last directory of the path. Paths 0..N-1 will all inherit the parent dir's permissions -even if- inheritPermission is false. This is a regression from somewhere around 0.20.9 and does not follow posix semantics.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
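A small illustration of the posix-style behavior the bug report expects: every directory created by mkdirs(path, perm), not just the last component, should carry the supplied permission (modulo umask). The paths below are hypothetical:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class MkdirsPermDemo {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FsPermission perm = new FsPermission((short) 0700);
    fs.mkdirs(new Path("/tmp/a/b/c"), perm);
    // Expected after the fix: /tmp/a, /tmp/a/b and /tmp/a/b/c all report
    // rwx------ rather than inheriting the parent's permissions.
    for (String p : new String[] {"/tmp/a", "/tmp/a/b", "/tmp/a/b/c"}) {
      System.out.println(p + " -> "
          + fs.getFileStatus(new Path(p)).getPermission());
    }
  }
}
{code}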
[jira] [Created] (HDFS-1963) HDFS rpm integration project
HDFS rpm integration project
----------------------------

Key: HDFS-1963
URL: https://issues.apache.org/jira/browse/HDFS-1963
Project: Hadoop HDFS
Issue Type: New Feature
Components: build
Environment: Java 6, RHEL 5.5
Reporter: Eric Yang
Assignee: Eric Yang

This jira corresponds to HADOOP-6255 and the associated directory layout change. The patch for creating HDFS rpm packaging should be posted here for the patch test build to verify against hdfs svn trunk.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1963) HDFS rpm integration project
[ https://issues.apache.org/jira/browse/HDFS-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Yang updated HDFS-1963:
----------------------------

Release Note: Create HDFS RPM package
Status: Patch Available (was: Open)

HDFS rpm integration project
----------------------------

Key: HDFS-1963
URL: https://issues.apache.org/jira/browse/HDFS-1963
Project: Hadoop HDFS
Issue Type: New Feature
Components: build
Environment: Java 6, RHEL 5.5
Reporter: Eric Yang
Assignee: Eric Yang
Attachments: HDFS-1963.patch

This jira corresponds to HADOOP-6255 and the associated directory layout change. The patch for creating HDFS rpm packaging should be posted here for the patch test build to verify against hdfs svn trunk.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1963) HDFS rpm integration project
[ https://issues.apache.org/jira/browse/HDFS-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Yang updated HDFS-1963:
----------------------------

Attachment: HDFS-1963.patch

HDFS rpm integration project
----------------------------

Key: HDFS-1963
URL: https://issues.apache.org/jira/browse/HDFS-1963
Project: Hadoop HDFS
Issue Type: New Feature
Components: build
Environment: Java 6, RHEL 5.5
Reporter: Eric Yang
Assignee: Eric Yang
Attachments: HDFS-1963.patch

This jira corresponds to HADOOP-6255 and the associated directory layout change. The patch for creating HDFS rpm packaging should be posted here for the patch test build to verify against hdfs svn trunk.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036311#comment-13036311 ]

Tsz Wo (Nicholas), SZE commented on HDFS-1958:
----------------------------------------------

bq. well, fsck for example supports either 'y' or 'Y' for yes, and 'n' or 'N' for no.

fsck and format are different: it is okay to accidentally click y for fsck but not for format. I was asking if this is a good feature in [my previous comment|https://issues.apache.org/jira/browse/HDFS-1958?focusedCommentId=13035881&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13035881]. I tried to change it a few years back but I heard one argument saying that the command was deliberately designed to prevent accidental formats.

BTW, I think this should be classified as a newbie issue once we have decided to do it.

Format confirmation prompt should be more lenient of its input
---------------------------------------------------------------

Key: HDFS-1958
URL: https://issues.apache.org/jira/browse/HDFS-1958
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Fix For: 0.23.0
Attachments: hdfs-1958.txt

As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1371) One bad node can incorrectly flag many files as corrupt
[ https://issues.apache.org/jira/browse/HDFS-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tanping Wang updated HDFS-1371:
-------------------------------

Status: Patch Available (was: Open)

One bad node can incorrectly flag many files as corrupt
-------------------------------------------------------

Key: HDFS-1371
URL: https://issues.apache.org/jira/browse/HDFS-1371
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs client, name-node
Affects Versions: 0.20.1, 0.23.0
Environment: yahoo internal version
[knoguchi@gwgd4003 ~]$ hadoop version
Hadoop 0.20.104.3.1007030707
Reporter: Koji Noguchi
Assignee: Tanping Wang
Fix For: 0.23.0
Attachments: HDFS-1371.04252011.patch, HDFS-1371.0503.patch, HDFS-1371.0513.patch, HDFS-1371.0515.patch, HDFS-1371.0517.2.patch, HDFS-1371.0517.patch, HDFS-1371.0518.2.patch, HDFS-1371.0518.patch

On our cluster, 12 files were reported as corrupt by fsck even though the replicas on the datanodes were healthy. Turns out that all the replicas (12 files x 3 replicas per file) were reported corrupt from one node. Surprisingly, these files were still readable/accessible from dfsclient (-get/-cat) without any problems.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1877) Create a functional test for file read/write
[ https://issues.apache.org/jira/browse/HDFS-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036320#comment-13036320 ]

Tsz Wo (Nicholas), SZE commented on HDFS-1877:
----------------------------------------------

- The variables, {{inJunitMode}}, {{BLOCK_SIZE}}, {{dfs}}, are not actually used. Please remove them.
- How about the default {{filenameOption}} equals {{ROOT_DIR}}?
- You may simply have {{static private Log LOG = LogFactory.getLog(TestWriteRead.class);}}
{code}
+  static private Log LOG;
+
+  @Before
+  public void initJunitModeTest() throws Exception {
+    LOG = LogFactory.getLog(TestWriteRead.class);
{code}
- Please remove the following. The default is already INFO.
{code}
+    ((Log4JLogger) FSNamesystem.LOG).getLogger().setLevel(Level.INFO);
+    ((Log4JLogger) DFSClient.LOG).getLogger().setLevel(Level.INFO);
{code}
- Most public methods should be package private.
- Please add comments to tell how to use the command options and the default values.

Create a functional test for file read/write
---------------------------------------------

Key: HDFS-1877
URL: https://issues.apache.org/jira/browse/HDFS-1877
Project: Hadoop HDFS
Issue Type: Test
Components: test
Affects Versions: 0.22.0
Reporter: CW Chung
Priority: Minor
Attachments: TestWriteRead.java, TestWriteRead.patch

It would be great to have a tool, running on a real grid, to perform function tests (and stress tests to a certain extent) for the file operations. The tool would be written in Java and make HDFS API calls to read, write, append, and hflush hadoop files. The tool would be usable standalone, or as a building block for other regression or stress test suites (written in shell, perl, python, etc).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1922) Recurring failure in TestJMXGet.testNameNode since build 477 on May 11
[ https://issues.apache.org/jira/browse/HDFS-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036326#comment-13036326 ]

Luke Lu commented on HDFS-1922:
-------------------------------

The only difference between the new behavior and metrics v1 is that in metrics v1, the metrics-related mbeans are started whether or not a metrics context is configured. In hindsight, I think I should've treated a missing config as a default/empty config for better compatibility and fewer surprises. I just opened HADOOP-7306 to revert the metrics system to the old behavior.

Recurring failure in TestJMXGet.testNameNode since build 477 on May 11
----------------------------------------------------------------------

Key: HDFS-1922
URL: https://issues.apache.org/jira/browse/HDFS-1922
Project: Hadoop HDFS
Issue Type: Sub-task
Components: test
Reporter: Matt Foley
Assignee: Luke Lu
Fix For: 0.23.0
Attachments: hdfs-1922-conf-v1.patch

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1957) Documentation for HFTP
[ https://issues.apache.org/jira/browse/HDFS-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036327#comment-13036327 ]

Tsz Wo (Nicholas), SZE commented on HDFS-1957:
----------------------------------------------

bq. ... Is the current text a bad way to say that?

The current text is good. I misread it earlier.

+1 patch looks good.

Documentation for HFTP
----------------------

Key: HDFS-1957
URL: https://issues.apache.org/jira/browse/HDFS-1957
Project: Hadoop HDFS
Issue Type: Improvement
Components: documentation
Affects Versions: 0.23.0
Reporter: Ari Rabkin
Assignee: Ari Rabkin
Priority: Minor
Fix For: 0.23.0
Attachments: HDFS-1957.patch, HDFS-1957.patch, HDFS-1957.patch

There should be some documentation for HFTP.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1905) Improve the usability of namenode -format
[ https://issues.apache.org/jira/browse/HDFS-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036330#comment-13036330 ]

Suresh Srinivas commented on HDFS-1905:
---------------------------------------

+1 for the patch.

Improve the usability of namenode -format
-----------------------------------------

Key: HDFS-1905
URL: https://issues.apache.org/jira/browse/HDFS-1905
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
Fix For: 0.23.0
Attachments: HDFS-1905-1.patch, HDFS-1905-2.patch

While setting up a 0.23-based cluster, I ran into this issue. When I issue a format namenode command, which got changed in 23, it should let the user know how to use this command in cases where the complete options were not specified.

./hdfs namenode -format

I get the following error msg, but it's still not clear how the user should use this command.

11/05/09 15:36:25 ERROR namenode.NameNode: java.lang.IllegalArgumentException: Format must be provided with clusterid
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1483)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1623)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1689)

The usability of this command can be improved.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1905) Improve the usability of namenode -format
[ https://issues.apache.org/jira/browse/HDFS-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated HDFS-1905:
----------------------------------

Resolution: Fixed
Status: Resolved (was: Patch Available)

I committed the patch. Thank you Bharath.

Improve the usability of namenode -format
-----------------------------------------

Key: HDFS-1905
URL: https://issues.apache.org/jira/browse/HDFS-1905
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
Fix For: 0.23.0
Attachments: HDFS-1905-1.patch, HDFS-1905-2.patch

While setting up a 0.23-based cluster, I ran into this issue. When I issue a format namenode command, which got changed in 23, it should let the user know how to use this command in cases where the complete options were not specified.

./hdfs namenode -format

I get the following error msg, but it's still not clear how the user should use this command.

11/05/09 15:36:25 ERROR namenode.NameNode: java.lang.IllegalArgumentException: Format must be provided with clusterid
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1483)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1623)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1689)

The usability of this command can be improved.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036339#comment-13036339 ]

Jakob Homan commented on HDFS-1958:
-----------------------------------

Less than 24 hours between this issue being opened and committed, on non-critical issues, seems a little short. Perhaps the community should be given more of a chance to weigh in before committing things, particularly when an experienced committer is raising questions about it?

Format confirmation prompt should be more lenient of its input
---------------------------------------------------------------

Key: HDFS-1958
URL: https://issues.apache.org/jira/browse/HDFS-1958
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Fix For: 0.23.0
Attachments: hdfs-1958.txt

As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036340#comment-13036340 ] Tsz Wo (Nicholas), SZE commented on HDFS-1057: -- I believe this test mostly fails on the build infrastructure ... It seems that the machines on the build infrastructure are slow/old/heavily loaded. Tests may fail more easily there than locally, so choosing the test parameters, e.g. how many concurrent writers, becomes non-trivial. Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append, 0.21.0, 0.22.0 Attachments: HDFS-1057-0.20-append.patch, conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
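The race in the issue description comes down to publish-before-flush ordering. A minimal sketch of the unsafe ordering versus the safe one, using stand-in types rather than the real BlockReceiver/replica classes:
{code}
import java.io.IOException;
import java.io.OutputStream;

// Illustration only; ReplicaSketch is a stand-in for the replica metadata
// that concurrent readers consult, not the actual HDFS type.
class ReplicaSketch {
  private volatile long bytesOnDisk;
  void setBytesOnDisk(long n) { bytesOnDisk = n; }   // what readers see
  long getBytesOnDisk() { return bytesOnDisk; }
}

class PacketReceiverSketch {
  private final ReplicaSketch replica = new ReplicaSketch();
  private final OutputStream out;
  private long written;

  PacketReceiverSketch(OutputStream out) { this.out = out; }

  // Unsafe: the new length is published while the bytes may still be buffered,
  // so a concurrent reader can hit an EOF or a checksum mismatch.
  void receiveUnsafe(byte[] pkt) throws IOException {
    replica.setBytesOnDisk(written + pkt.length);
    out.write(pkt);
    out.flush();
    written += pkt.length;
  }

  // Safe: flush first, then publish the length a concurrent reader may use.
  void receiveSafe(byte[] pkt) throws IOException {
    out.write(pkt);
    out.flush();
    written += pkt.length;
    replica.setBytesOnDisk(written);
  }
}
{code}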
[jira] [Updated] (HDFS-1941) Remove -genclusterid from NameNode startup options
[ https://issues.apache.org/jira/browse/HDFS-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-1941: -- Component/s: name-node Fix Version/s: 0.23.0 Remove -genclusterid from NameNode startup options -- Key: HDFS-1941 URL: https://issues.apache.org/jira/browse/HDFS-1941 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Priority: Minor Fix For: 0.23.0 Attachments: HDFS-1941-1.patch Currently, namenode -genclusterid is a helper utility to generate a unique clusterid. This option becomes unnecessary once namenode -format automatically generates the clusterid. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1941) Remove -genclusterid from NameNode startup options
[ https://issues.apache.org/jira/browse/HDFS-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-1941: -- Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I committed the patch. Thank you Bharath. Remove -genclusterid from NameNode startup options -- Key: HDFS-1941 URL: https://issues.apache.org/jira/browse/HDFS-1941 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Priority: Minor Fix For: 0.23.0 Attachments: HDFS-1941-1.patch Currently, namenode -genclusterid is a helper utility to generate a unique clusterid. This option becomes unnecessary once namenode -format automatically generates the clusterid. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036343#comment-13036343 ] sam rash commented on HDFS-1057: if it helps, there is only ever 1 writer + 1 reader in the test. 1 reader 'tails' by opening and closing the file repeatedly, up to 1000 times (hence exposing socket leaks in the past) Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append, 0.21.0, 0.22.0 Attachments: HDFS-1057-0.20-append.patch, conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036347#comment-13036347 ] Todd Lipcon commented on HDFS-1958: --- bq. particularly when an experienced commmitter is raising questions about it? Excuse me - I took Nicholas's question for a joke, to be honest, given it referenced high school students and didn't raise technical objections. bq. fsck and format are different: it is okay to accidentally click y for fsck but not for format. OK, another comparison: mke2fs doesn't ask for confirmation at all. I checked this across ext2, ext3, and ntfs. Format confirmation prompt should be more lenient of its input -- Key: HDFS-1958 URL: https://issues.apache.org/jira/browse/HDFS-1958 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: hdfs-1958.txt As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036348#comment-13036348 ] Tsz Wo (Nicholas), SZE commented on HDFS-1057: -- Sam, could you either investigate the underlying problem or improve the test so that it won't fail on hudson? Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append, 0.21.0, 0.22.0 Attachments: HDFS-1057-0.20-append.patch, conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1922) Recurring failure in TestJMXGet.testNameNode since build 477 on May 11
[ https://issues.apache.org/jira/browse/HDFS-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036349#comment-13036349 ] Todd Lipcon commented on HDFS-1922: --- cool, thanks Luke. +1 on this patch to fix the tests, then. Recurring failure in TestJMXGet.testNameNode since build 477 on May 11 -- Key: HDFS-1922 URL: https://issues.apache.org/jira/browse/HDFS-1922 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Matt Foley Assignee: Luke Lu Fix For: 0.23.0 Attachments: hdfs-1922-conf-v1.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036353#comment-13036353 ] Todd Lipcon commented on HDFS-1958: --- btw, for those who might be concerned about accidentally formatting the NN (perhaps your cat likes to jump on the 'y' key and then the enter key), you can also enable HDFS-718 to completely disallow it. Format confirmation prompt should be more lenient of its input -- Key: HDFS-1958 URL: https://issues.apache.org/jira/browse/HDFS-1958 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: hdfs-1958.txt As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1952) FSEditLog.open() appears to succeed even if all EDITS directories fail
[ https://issues.apache.org/jira/browse/HDFS-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036366#comment-13036366 ] Matt Foley commented on HDFS-1952: -- Agree. Maybe change the exception message to "Failed to initialize edits log in any storage directory." The test-patch failures are recurring issues unrelated to this patch. FSEditLog.open() appears to succeed even if all EDITS directories fail -- Key: HDFS-1952 URL: https://issues.apache.org/jira/browse/HDFS-1952 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Matt Foley Assignee: Andrew Wang Labels: newbie Attachments: hdfs-1952.patch FSEditLog.open() appears to succeed even if all of the individual directories failed to allow creation of an EditLogOutputStream. The problem and solution are essentially similar to that of HDFS-1505. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
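The fix pattern implied here (the same one as HDFS-1505) is to count the streams that open successfully and fail loudly when none survive. A hedged sketch with illustrative names, not the real FSEditLog API:
{code}
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the check the issue asks for.
class EditLogOpenSketch {
  List<OutputStream> open(List<File> editsDirs) throws IOException {
    List<OutputStream> streams = new ArrayList<>();
    for (File dir : editsDirs) {
      try {
        streams.add(new FileOutputStream(new File(dir, "edits")));
      } catch (IOException e) {
        // A single bad directory is tolerable; log and keep going.
        System.err.println("Unable to open edit log in " + dir + ": " + e);
      }
    }
    if (streams.isEmpty()) {
      // open() must not appear to succeed when every directory failed.
      throw new IOException(
          "Failed to initialize edits log in any storage directory.");
    }
    return streams;
  }
}
{code}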
[jira] [Updated] (HDFS-1922) Recurring failure in TestJMXGet.testNameNode since build 477 on May 11
[ https://issues.apache.org/jira/browse/HDFS-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1922: -- Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Luke. Recurring failure in TestJMXGet.testNameNode since build 477 on May 11 -- Key: HDFS-1922 URL: https://issues.apache.org/jira/browse/HDFS-1922 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Matt Foley Assignee: Luke Lu Fix For: 0.23.0 Attachments: hdfs-1922-conf-v1.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1957) Documentation for HFTP
[ https://issues.apache.org/jira/browse/HDFS-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1957: -- Resolution: Fixed Fix Version/s: (was: 0.23.0) 0.22.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed to 22 and trunk. Thanks, Ari! Documentation for HFTP -- Key: HDFS-1957 URL: https://issues.apache.org/jira/browse/HDFS-1957 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 0.23.0 Reporter: Ari Rabkin Assignee: Ari Rabkin Priority: Minor Fix For: 0.22.0 Attachments: HDFS-1957.patch, HDFS-1957.patch, HDFS-1957.patch There should be some documentation for HFTP. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1505) saveNamespace appears to succeed even if all directories fail to save
[ https://issues.apache.org/jira/browse/HDFS-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036376#comment-13036376 ] Matt Foley commented on HDFS-1505: -- Reading the HDFS-1073 spec, I infer that fsimage files will have a tag identifying the last txn included in the image, and edits logs will have tags for the first and last txn included in them. And you're referring to the resulting fact that one could take an image ending with txn 100, jump into the middle of a log file that went from txn 50 to 170, and successfully generate the in-memory structures current as of txn 170. Is that right? If the above understanding is correct, then I agree it seems that saveNamespace() should just save the fsimage file. Although it doesn't hurt to also clear the edits logs, once you have multiple copies of the fsimage. Does your log-rolling logic automatically delete log chunk files older than available fsimage files? That would be sufficient edits file management. saveNamespace appears to succeed even if all directories fail to save - Key: HDFS-1505 URL: https://issues.apache.org/jira/browse/HDFS-1505 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Todd Lipcon Assignee: Aaron T. Myers Priority: Blocker Fix For: 0.22.0 Attachments: hdfs-1505-1-test.txt, hdfs-1505-22.0.patch, hdfs-1505-22.1.patch, hdfs-1505-22.2.patch, hdfs-1505-test.txt, hdfs-1505-trunk.0.patch, hdfs-1505-trunk.1.patch, hdfs-1505-trunk.2.patch, hdfs-1505-trunk.3.patch After HDFS-1071, saveNamespace now appears to succeed even if all of the individual directories failed to save. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1568) Improve DataXceiver error logging
[ https://issues.apache.org/jira/browse/HDFS-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036383#comment-13036383 ] Hadoop QA commented on HDFS-1568: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12479798/HDFS-1568-output-changes.patch against trunk revision 1124576. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.hdfs.TestHDFSTrash org.apache.hadoop.tools.TestJMXGet +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/586//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/586//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/586//console This message is automatically generated. Improve DataXceiver error logging - Key: HDFS-1568 URL: https://issues.apache.org/jira/browse/HDFS-1568 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Joey Echeverria Priority: Minor Labels: newbie Attachments: HDFS-1568-1.patch, HDFS-1568-output-changes.patch In supporting customers we often see things like SocketTimeoutExceptions or EOFExceptions coming from DataXceiver, but the logging isn't very good. For example, if we get an IOE while setting up a connection to the downstream mirror in writeBlock, the IP of the downstream mirror isn't logged on the DN side. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1505) saveNamespace appears to succeed even if all directories fail to save
[ https://issues.apache.org/jira/browse/HDFS-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036396#comment-13036396 ] Todd Lipcon commented on HDFS-1505: --- Hey Matt. You're pretty close. bq. And you're referring to the resulting fact that one could take an image ending with txn 100, jump into the middle of a log file that went from txn 50 to 170 In theory, yes. In the current implementation, images are only saved at boundaries of edit log segments. So if you have an image with txn 100, then you'll have some edit log file which starts at 101, so the jump into the middle part isn't necessary. bq. Although it doesn't hurt to also clear the edits logs, once you have multiple copies of the fsimage. Does your log-rolling logic automatically delete log chunk files older than available fsimage files? It's not implemented yet, but the idea is that a separate background thread would be responsible for handling management of old files based on various policies (eg remove old ones, or perhaps archive to some other location) So, sounds like we're in agreement. Thanks. saveNamespace appears to succeed even if all directories fail to save - Key: HDFS-1505 URL: https://issues.apache.org/jira/browse/HDFS-1505 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Todd Lipcon Assignee: Aaron T. Myers Priority: Blocker Fix For: 0.22.0 Attachments: hdfs-1505-1-test.txt, hdfs-1505-22.0.patch, hdfs-1505-22.1.patch, hdfs-1505-22.2.patch, hdfs-1505-test.txt, hdfs-1505-trunk.0.patch, hdfs-1505-trunk.1.patch, hdfs-1505-trunk.2.patch, hdfs-1505-trunk.3.patch After HDFS-1071, saveNamespace now appears to succeed even if all of the individual directories failed to save. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
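For concreteness, the loading rule the two are converging on can be sketched like this; the types are hypothetical stand-ins (record syntax used for brevity), not the HDFS-1073 implementation:
{code}
import java.util.List;

// Illustration of the rule discussed above: take the newest image, then
// replay only the edit segments that start right after it.
class CheckpointSketch {
  record Image(long lastTxn) {}
  record EditSegment(long firstTxn, long lastTxn) {}

  static long load(Image image, List<EditSegment> segments) {
    long current = image.lastTxn();
    for (EditSegment seg : segments) {
      if (seg.firstTxn() == current + 1) {  // segments begin at image+1,
        current = seg.lastTxn();            // so no mid-file jump is needed
      }
    }
    return current;  // the namespace is current up to this transaction
  }
}
{code}
Under this rule, an image ending at txn 100 plus a segment spanning 101-170 yields a namespace current to txn 170, matching the exchange above.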
[jira] [Updated] (HDFS-420) fuse_dfs is unable to connect to the dfs after a copying a large number of files into the dfs over fuse
[ https://issues.apache.org/jira/browse/HDFS-420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Bockelman updated HDFS-420: - Attachment: fuse_dfs_020_memleaks_v8.patch Ok, I tested this one for awhile prior to posting it. We have been running this on our ~2PB cluster with 250 machines for around 2-3 weeks. No crashes have been reported. No memory leaks are observed. Unit tests pass. Site admins report they are much happier with FUSE fuse_dfs is unable to connect to the dfs after a copying a large number of files into the dfs over fuse --- Key: HDFS-420 URL: https://issues.apache.org/jira/browse/HDFS-420 Project: Hadoop HDFS Issue Type: Bug Components: contrib/fuse-dfs Affects Versions: 0.20.2 Environment: Fedora core 10, x86_64, 2.6.27.7-134.fc10.x86_64 #1 SMP (AMD 64), gcc 4.3.2, java 1.6.0 (IcedTea6 1.4 (fedora-7.b12.fc10-x86_64) Runtime Environment (build 1.6.0_0-b12) OpenJDK 64-Bit Server VM (build 10.0-b19, mixed mode) Reporter: Dima Brodsky Assignee: Brian Bockelman Fix For: 0.20.3 Attachments: fuse_dfs_020_memleaks.patch, fuse_dfs_020_memleaks_v3.patch, fuse_dfs_020_memleaks_v8.patch I run the following test: 1. Run hadoop DFS in single node mode 2. start up fuse_dfs 3. copy my source tree, about 250 megs, into the DFS cp -av * /mnt/hdfs/ in /var/log/messages I keep seeing: Dec 22 09:02:08 bodum fuse_dfs: ERROR: hdfs trying to utime /bar/backend-trunk2/src/machinery/hadoop/output/2008/11/19 to 1229385138/1229963739 and then eventually Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 and the file system hangs. hadoop is still running and I don't see any errors in it's logs. I have to unmount the dfs and restart fuse_dfs and then everything is fine again. 
At some point I see the following messages in the /var/log/messages: ERROR: dfs problem - could not close file_handle(139677114350528) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8339-93825052368848-1229278807.log fuse_dfs.c:1464 Dec 22 09:04:49 bodum fuse_dfs: ERROR: dfs problem - could not close file_handle(139676770220176) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8140-93825025883216-1229278759.log fuse_dfs.c:1464 Dec 22 09:05:13 bodum fuse_dfs: ERROR: dfs problem - could not close file_handle(139677114812832) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8138-93825070138960-1229251587.log fuse_dfs.c:1464 Is this a known issue? Am I just flooding the system too much. All of this is being performed on a single, dual core, machine. Thanks! ttyl Dima -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-420) fuse_dfs is unable to connect to the dfs after a copying a large number of files into the dfs over fuse
[ https://issues.apache.org/jira/browse/HDFS-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036405#comment-13036405 ] Hadoop QA commented on HDFS-420: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12479821/fuse_dfs_020_memleaks_v8.patch against trunk revision 1125057. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/589//console This message is automatically generated. fuse_dfs is unable to connect to the dfs after a copying a large number of files into the dfs over fuse --- Key: HDFS-420 URL: https://issues.apache.org/jira/browse/HDFS-420 Project: Hadoop HDFS Issue Type: Bug Components: contrib/fuse-dfs Affects Versions: 0.20.2 Environment: Fedora core 10, x86_64, 2.6.27.7-134.fc10.x86_64 #1 SMP (AMD 64), gcc 4.3.2, java 1.6.0 (IcedTea6 1.4 (fedora-7.b12.fc10-x86_64) Runtime Environment (build 1.6.0_0-b12) OpenJDK 64-Bit Server VM (build 10.0-b19, mixed mode) Reporter: Dima Brodsky Assignee: Brian Bockelman Fix For: 0.20.3 Attachments: fuse_dfs_020_memleaks.patch, fuse_dfs_020_memleaks_v3.patch, fuse_dfs_020_memleaks_v8.patch I run the following test: 1. Run hadoop DFS in single node mode 2. start up fuse_dfs 3. copy my source tree, about 250 megs, into the DFS cp -av * /mnt/hdfs/ in /var/log/messages I keep seeing: Dec 22 09:02:08 bodum fuse_dfs: ERROR: hdfs trying to utime /bar/backend-trunk2/src/machinery/hadoop/output/2008/11/19 to 1229385138/1229963739 and then eventually Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 
09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 and the file system hangs. hadoop is still running and I don't see any errors in it's logs. I have to unmount the dfs and restart fuse_dfs and then everything is fine again. At some point I see the following messages in the /var/log/messages: ERROR: dfs problem - could not close file_handle(139677114350528) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8339-93825052368848-1229278807.log fuse_dfs.c:1464 Dec 22 09:04:49 bodum fuse_dfs: ERROR: dfs problem - could not close file_handle(139676770220176) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8140-93825025883216-1229278759.log fuse_dfs.c:1464 Dec 22 09:05:13 bodum fuse_dfs: ERROR: dfs problem - could not close file_handle(139677114812832) for
[jira] [Commented] (HDFS-1869) mkdirs should use the supplied permission for all of the created directories
[ https://issues.apache.org/jira/browse/HDFS-1869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036409#comment-13036409 ] Todd Lipcon commented on HDFS-1869: --- It would be great to track down which JIRA it was that broke this upstream as well. I don't think it could be quotas since they've been around since 0.17 or so iirc. mkdirs should use the supplied permission for all of the created directories Key: HDFS-1869 URL: https://issues.apache.org/jira/browse/HDFS-1869 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-1869-2.patch, HDFS-1869.patch Mkdirs only uses the supplied FsPermission for the last directory of the path. Paths 0..N-1 will all inherit the parent dir's permissions -even if- inheritPermission is false. This is a regression from somewhere around 0.20.9 and does not follow posix semantics. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
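The POSIX-style semantics the issue asks for, applying the supplied permission to every directory created along the path rather than only the last one, can be sketched with plain java.nio for illustration; the actual patch operates on the namenode's INode tree, and umask subtleties are glossed over here.
{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Illustrative sketch, not the HDFS-1869 patch.
class MkdirsSketch {
  static void mkdirs(Path path, Set<PosixFilePermission> perm)
      throws IOException {
    // Walk from the root down, creating each missing component with perm,
    // so intermediate directories do not silently inherit parent modes.
    Path current = path.isAbsolute() ? path.getRoot() : Path.of(".");
    for (Path component : path) {
      current = current.resolve(component);
      if (!Files.exists(current)) {
        Files.createDirectory(current,
            PosixFilePermissions.asFileAttribute(perm));
      }
    }
  }
}
{code}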
[jira] [Commented] (HDFS-1961) New architectural documentation created
[ https://issues.apache.org/jira/browse/HDFS-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036410#comment-13036410 ] Matt Foley commented on HDFS-1961: -- Good start. A few suggestions:
Section 4.4: Suggest starting with: "All communication between Namenode and Datanode is initiated by the Datanode, and responded to by the Namenode. The Namenode never initiates communication to the Datanode, although Namenode responses may include commands to the Datanode that cause it to send further communications."
4.4.2 "DataNode Command – send heartbeat": suggest changing to "DataNode sends Heartbeat".
4.4.3 "DataNodeCommand – block report": suggest changing to "DataNode sends BlockReport".
4.4.4 "BlockReceived": suggest changing to "DataNode notifies BlockReceived".
Section 5.2: In the list of NN threads, calling the first one "HeartBeat" is a little confusing. Please consider calling it something like "Datanode Health Management" instead. In the code it is called HeartbeatMonitor, but its job is neither sending nor receiving heartbeats; rather, it periodically checks that every Datanode has sent a heartbeat at least once in the last 10 minutes (or as configured). Should probably also mention the bundle of threads that provide the Namenode's RPC service, which receives and processes all 13 kinds of communication from Datanodes and Clients.
Section 5.3: "This [blockReceived notification] may prevent NameNode temporarily from asking for a full block report since the receipt of a blockReceived() message indicates that the DataNode is still alive." That sentence isn't correct, since it is relatively unusual for the NN to ask the DN for a block report. (It only happens when recovering from gross errors.) Instead, suggest including in this section a brief discussion of the fact that the DN sends a heartbeat to the NN every 3 seconds (or as configured), which gives the NN a chance to respond with commands such as:
* delete replica, if a block has become over-replicated, or
* copy replica to this other DN, if a block needs further replication.
The DN also initiates a BlockReport to the NN every hour (or as configured), which prevents any divergence between the NN's and the DN's belief about which replicas are held by each datanode. And yes, it also sends an immediate blockReceived notification whenever it receives a new block, whether from a Client (file create/append) or from another Datanode (block replication).
"A blockReport() is also issued periodically as a portion of the HeartBeat." Not exactly. The DN's heartbeat thread takes care of sending both, at the appropriate time intervals, but they are separate RPCs to the NN.
New architectural documentation created --- Key: HDFS-1961 URL: https://issues.apache.org/jira/browse/HDFS-1961 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 0.21.0 Reporter: Rick Kazman Labels: architecture, hadoop, newbie Fix For: 0.21.0 Attachments: HDFS ArchDoc.Jira.docx This material provides an overview of the HDFS architecture and is intended for contributors. The goal of this document is to provide a guide to the overall structure of the HDFS code so that contributors can more effectively understand how changes that they are considering can be made, and the consequences of those changes. The assumption is that the reader has a basic understanding of HDFS, its purpose, and how it fits into the Hadoop project suite.
An HTML version of the architectural documentation can be found at: http://kazman.shidler.hawaii.edu/ArchDoc.html All comments and suggestions for improvements are appreciated. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
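The datanode-initiated traffic Matt describes above (3-second heartbeats whose replies may carry commands, plus an hourly block report sent as a separate RPC by the same thread) can be sketched roughly as follows; the interface is a hypothetical stand-in, not the real DatanodeProtocol:
{code}
// Rough illustration of the DN's offer-service loop; names are invented.
interface NamenodeStub {
  String[] sendHeartbeat();          // reply may contain commands for the DN
  void blockReport(long[] blockIds); // separate RPC, sent hourly
  void blockReceived(long blockId);  // sent immediately on receiving a block
}

class OfferServiceSketch implements Runnable {
  private final NamenodeStub namenode;
  private long lastBlockReport;

  OfferServiceSketch(NamenodeStub nn) { this.namenode = nn; }

  @Override public void run() {
    while (!Thread.currentThread().isInterrupted()) {
      // Heartbeat every 3 seconds; execute whatever commands come back.
      for (String cmd : namenode.sendHeartbeat()) {
        System.out.println("executing command from NN: " + cmd);
      }
      long now = System.currentTimeMillis();
      if (now - lastBlockReport >= 60L * 60 * 1000) {  // hourly, its own RPC
        namenode.blockReport(new long[0]);
        lastBlockReport = now;
      }
      try { Thread.sleep(3000); }
      catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
  }
}
{code}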
[jira] [Commented] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036413#comment-13036413 ] Tsz Wo (Nicholas), SZE commented on HDFS-1958: -- bq. Excuse me - I took Nicholas's question for a joke, to be honest, given it referenced high school students and didn't raise technical objections. It is a half joke. :) However, it is not convincing to change a feature affecting user behavior simply because someone has reported it on the mailing list. Format confirmation prompt should be more lenient of its input -- Key: HDFS-1958 URL: https://issues.apache.org/jira/browse/HDFS-1958 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: hdfs-1958.txt As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036416#comment-13036416 ] Tsz Wo (Nicholas), SZE commented on HDFS-1958: -- Todd, do you agree that this is a newbie issue? Format confirmation prompt should be more lenient of its input -- Key: HDFS-1958 URL: https://issues.apache.org/jira/browse/HDFS-1958 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: hdfs-1958.txt As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036419#comment-13036419 ] Todd Lipcon commented on HDFS-1958: --- bq. However, it is not as convincing as suggested to change a feature affecting user behavior simply because someone has reported on the mailing list. Fair enough. I'll try to reproduce the reasoning from the mailing list in the future. bq. Todd, do you agree that this is a newbie issue Yes. But I already addressed it so no need to retroactively tag it as such (IMO the point of the newbie label is just to help new contributors find open JIRAs that might be easy to start with). Format confirmation prompt should be more lenient of its input -- Key: HDFS-1958 URL: https://issues.apache.org/jira/browse/HDFS-1958 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: hdfs-1958.txt As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-1935) Build should not redownload ivy on every invocation
[ https://issues.apache.org/jira/browse/HDFS-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon reassigned HDFS-1935: - Assignee: (was: Todd Lipcon) Build should not redownload ivy on every invocation --- Key: HDFS-1935 URL: https://issues.apache.org/jira/browse/HDFS-1935 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Todd Lipcon Priority: Trivial Labels: newbie Fix For: 0.22.0 Attachments: hdfs-1935.txt Currently we re-download ivy every time we build. If the jar already exists, we should skip this. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1371) One bad node can incorrectly flag many files as corrupt
[ https://issues.apache.org/jira/browse/HDFS-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036430#comment-13036430 ] Hadoop QA commented on HDFS-1371: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12479689/HDFS-1371.0518.2.patch against trunk revision 1125057. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 11 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.hdfs.TestHDFSTrash +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/588//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/588//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/588//console This message is automatically generated. One bad node can incorrectly flag many files as corrupt --- Key: HDFS-1371 URL: https://issues.apache.org/jira/browse/HDFS-1371 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20.1, 0.23.0 Environment: yahoo internal version [knoguchi@gwgd4003 ~]$ hadoop version Hadoop 0.20.104.3.1007030707 Reporter: Koji Noguchi Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-1371.04252011.patch, HDFS-1371.0503.patch, HDFS-1371.0513.patch, HDFS-1371.0515.patch, HDFS-1371.0517.2.patch, HDFS-1371.0517.patch, HDFS-1371.0518.2.patch, HDFS-1371.0518.patch On our cluster, 12 files were reported as corrupt by fsck even though the replicas on the datanodes were healthy. Turns out that all the replicas (12 files x 3 replicas per file) were reported corrupt from one node. Surprisingly, these files were still readable/accessible from dfsclient (-get/-cat) without any problems. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1568) Improve DataXceiver error logging
[ https://issues.apache.org/jira/browse/HDFS-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036434#comment-13036434 ] Tsz Wo (Nicholas), SZE commented on HDFS-1568: -- - Could you not reformat the message in this patch? Otherwise, it is hard to review. You may fix the message format in a separate JIRA.
{code}
-          block + " to " +
-          s.getInetAddress() + ":\n" +
-          StringUtils.stringifyException(ioe) );
+          block + " to " +
+          remoteAddress + ":\n" +
+          StringUtils.stringifyException(ioe) );
{code}
Improve DataXceiver error logging - Key: HDFS-1568 URL: https://issues.apache.org/jira/browse/HDFS-1568 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Joey Echeverria Priority: Minor Labels: newbie Attachments: HDFS-1568-1.patch, HDFS-1568-output-changes.patch In supporting customers we often see things like SocketTimeoutExceptions or EOFExceptions coming from DataXceiver, but the logging isn't very good. For example, if we get an IOE while setting up a connection to the downstream mirror in writeBlock, the IP of the downstream mirror isn't logged on the DN side. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1371) One bad node can incorrectly flag many files as corrupt
[ https://issues.apache.org/jira/browse/HDFS-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036444#comment-13036444 ] Tanping Wang commented on HDFS-1371: These three tests are already failing on trunk. One bad node can incorrectly flag many files as corrupt --- Key: HDFS-1371 URL: https://issues.apache.org/jira/browse/HDFS-1371 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20.1, 0.23.0 Environment: yahoo internal version [knoguchi@gwgd4003 ~]$ hadoop version Hadoop 0.20.104.3.1007030707 Reporter: Koji Noguchi Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-1371.04252011.patch, HDFS-1371.0503.patch, HDFS-1371.0513.patch, HDFS-1371.0515.patch, HDFS-1371.0517.2.patch, HDFS-1371.0517.patch, HDFS-1371.0518.2.patch, HDFS-1371.0518.patch On our cluster, 12 files were reported as corrupt by fsck even though the replicas on the datanodes were healthy. Turns out that all the replicas (12 files x 3 replicas per file) were reported corrupt from one node. Surprisingly, these files were still readable/accessible from dfsclient (-get/-cat) without any problems. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1963) HDFS rpm integration project
[ https://issues.apache.org/jira/browse/HDFS-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036446#comment-13036446 ] Hadoop QA commented on HDFS-1963: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12479801/HDFS-1963.patch against trunk revision 1125057. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 14 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. -1 release audit. The applied patch generated 2 release audit warnings (more than the trunk's current 0 warnings). -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.hdfs.TestHDFSTrash +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/587//testReport/ Release audit warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/587//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/587//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/587//console This message is automatically generated. HDFS rpm integration project Key: HDFS-1963 URL: https://issues.apache.org/jira/browse/HDFS-1963 Project: Hadoop HDFS Issue Type: New Feature Components: build Environment: Java 6, RHEL 5.5 Reporter: Eric Yang Assignee: Eric Yang Attachments: HDFS-1963.patch This jira is corresponding to HADOOP-6255 and associated directory layout change. The patch for creating HDFS rpm packaging should be posted here for patch test build to verify against hdfs svn trunk. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036453#comment-13036453 ] Todd Lipcon commented on HDFS-1057: --- Sam seems to be correct that there's some kind of leak going on. lsof on the java process shows several hundred unix sockets open. Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append, 0.21.0, 0.22.0 Attachments: HDFS-1057-0.20-append.patch, conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036461#comment-13036461 ] sam rash commented on HDFS-1057: todd: thanks for digging into this Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append, 0.21.0, 0.22.0 Attachments: HDFS-1057-0.20-append.patch, conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1906) Remove logging exception stack trace when one of the datanode targets to read from is not reachable
[ https://issues.apache.org/jira/browse/HDFS-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-1906: -- Attachment: HDFS-1906.rel205.patch Patch for 0.20.205. Remove logging exception stack trace when one of the datanode targets to read from is not reachable --- Key: HDFS-1906 URL: https://issues.apache.org/jira/browse/HDFS-1906 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 0.20.203.1 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Priority: Minor Fix For: 0.23.0 Attachments: HDFS-1906.2.patch, HDFS-1906.patch, HDFS-1906.rel205.patch When the client fails to connect to one of the datanodes in the returned list of block locations, the exception stack trace is printed in the client log. This is an expected failure scenario that is handled at the client by moving on to the next location. Printing the entire stack trace is unnecessary; printing just the exception message should be sufficient. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
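The change being reviewed amounts to logging only the exception's message for this expected failure before moving on to the next replica location. A minimal sketch under that reading, using java.util.logging and invented names rather than the actual DFSClient code:
{code}
import java.io.IOException;
import java.util.logging.Logger;

// Illustrative sketch, not the HDFS-1906 patch.
class ReadRetrySketch {
  private static final Logger LOG = Logger.getLogger("DFSClient");

  void readFrom(String[] datanodes) {
    for (String dn : datanodes) {
      try {
        connect(dn);
        return;  // success
      } catch (IOException e) {
        // Expected, recoverable failure: log the message only, no stack trace.
        LOG.info("Failed to connect to " + dn + ": " + e.getMessage()
            + ", trying next datanode");
      }
    }
  }

  private void connect(String dn) throws IOException {
    throw new IOException("connection refused");  // stand-in for a real dial
  }
}
{code}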
[jira] [Commented] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036477#comment-13036477 ] Tsz Wo (Nicholas), SZE commented on HDFS-1958: -- bq. OK, another comparison: mke2fs doesn't ask for confirmation at all. I checked this across ext2, ext3, and ntfs. Not asking for confirmation at all and accepting input case-insensitively are two different things. Moreover, since the -format command has been there for years, I wonder if some admins are already taking advantage of the fact that 'y' won't format. For example, the 'yes' command outputs lower-case y's. bq. Yes. But I already addressed it so no need to retroactively tag it as such (IMO the point of the newbie label is just to help new contributors find open JIRAs that might be easy to start with). Okay, I think you would like to leave the easy issues for the new contributors. Format confirmation prompt should be more lenient of its input -- Key: HDFS-1958 URL: https://issues.apache.org/jira/browse/HDFS-1958 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: hdfs-1958.txt As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1906) Remove logging exception stack trace when one of the datanode targets to read from is not reachable
[ https://issues.apache.org/jira/browse/HDFS-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036485#comment-13036485 ] Tsz Wo (Nicholas), SZE commented on HDFS-1906: -- bq. Patch for 0.20.205 +1 Remove logging exception stack trace when one of the datanode targets to read from is not reachable --- Key: HDFS-1906 URL: https://issues.apache.org/jira/browse/HDFS-1906 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 0.20.203.1 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Priority: Minor Fix For: 0.23.0 Attachments: HDFS-1906.2.patch, HDFS-1906.patch, HDFS-1906.rel205.patch When the client fails to connect to one of the datanodes in the returned list of block locations, the exception stack trace is printed in the client log. This is an expected failure scenario that is handled at the client by moving on to the next location. Printing the entire stack trace is unnecessary; printing just the exception message should be sufficient. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-1964) Incorrect HTML unescaping in DatanodeJspHelper.java
Incorrect HTML unescaping in DatanodeJspHelper.java --- Key: HDFS-1964 URL: https://issues.apache.org/jira/browse/HDFS-1964 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 0.22.0, 0.23.0 HDFS-1575 introduced some HTML unescaping of parameters so that viewing a file would work for paths containing HTML-escaped characters, but in two of the places the unescaping was done either too early or too late. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
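Why the stage at which unescaping happens matters can be shown with a toy example; this is not the DatanodeJspHelper code, just an illustration of how unescaping a value at the wrong point (here, twice) corrupts it:
{code}
// Toy illustration; unescape exactly once, after the raw parameter value
// has been extracted from the request and before it is used as a path.
class UnescapeOrderSketch {
  static String htmlUnescape(String s) {
    // &amp; must be handled last, or new entities would be created.
    return s.replace("&lt;", "<").replace("&gt;", ">")
            .replace("&quot;", "\"").replace("&amp;", "&");
  }

  public static void main(String[] args) {
    String param = "a&amp;lt;b";                       // encodes "a&lt;b"
    String once  = htmlUnescape(param);                // "a&lt;b"  (correct)
    String twice = htmlUnescape(htmlUnescape(param));  // "a<b"     (corrupted)
    System.out.println(once + " vs " + twice);
  }
}
{code}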
[jira] [Commented] (HDFS-1371) One bad node can incorrectly flag many files as corrupt
[ https://issues.apache.org/jira/browse/HDFS-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036491#comment-13036491 ] Jitendra Nath Pandey commented on HDFS-1371: +1 One bad node can incorrectly flag many files as corrupt --- Key: HDFS-1371 URL: https://issues.apache.org/jira/browse/HDFS-1371 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20.1, 0.23.0 Environment: yahoo internal version [knoguchi@gwgd4003 ~]$ hadoop version Hadoop 0.20.104.3.1007030707 Reporter: Koji Noguchi Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-1371.04252011.patch, HDFS-1371.0503.patch, HDFS-1371.0513.patch, HDFS-1371.0515.patch, HDFS-1371.0517.2.patch, HDFS-1371.0517.patch, HDFS-1371.0518.2.patch, HDFS-1371.0518.patch On our cluster, 12 files were reported as corrupt by fsck even though the replicas on the datanodes were healthy. Turns out that all the replicas (12 files x 3 replicas per file) were reported corrupt from one node. Surprisingly, these files were still readable/accessible from dfsclient (-get/-cat) without any problems. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036492#comment-13036492 ] Todd Lipcon commented on HDFS-1057: --- Actually, it looks like the leak in 7146 pushed this over the edge. But, even with that patch, if I lsof the java process as it runs, I see it hit 800 or so localhost TCP connections in ESTABLISHED state while running this test case. So, needs more investigation yet. Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append, 0.21.0, 0.22.0 Attachments: HDFS-1057-0.20-append.patch, conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1952) FSEditLog.open() appears to succeed even if all EDITS directories fail
[ https://issues.apache.org/jira/browse/HDFS-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036493#comment-13036493 ] Hadoop QA commented on HDFS-1952: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12479839/hdfs-1952.patch against trunk revision 1125057. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/591//console This message is automatically generated. FSEditLog.open() appears to succeed even if all EDITS directories fail -- Key: HDFS-1952 URL: https://issues.apache.org/jira/browse/HDFS-1952 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Matt Foley Assignee: Andrew Wang Labels: newbie Attachments: hdfs-1952.patch, hdfs-1952.patch FSEditLog.open() appears to succeed even if all of the individual directories failed to allow creation of an EditLogOutputStream. The problem and solution are essentially similar to that of HDFS-1505. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1575) viewing block from web UI broken
[ https://issues.apache.org/jira/browse/HDFS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-1575: - Attachment: hdfs-1575-22.0.patch What's attached is a faithful back-port of the trunk commit. In the course of doing this back-port I identified a bug, which I've filed under HDFS-1964. Let's commit this patch to branch-0.22 and then I'll back-port the bug fix. viewing block from web UI broken Key: HDFS-1575 URL: https://issues.apache.org/jira/browse/HDFS-1575 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Aaron T. Myers Priority: Blocker Fix For: 0.22.0 Attachments: HDFS-1575, hdfs-1575-22.0.patch, hdfs-1575-trunk.0.patch, hdfs-1575-trunk.1.patch, hdfs-1575-trunk.2.patch, hdfs-1575-trunk.3.patch DatanodeJspHelper seems to expect the file path to be in the path info of the HttpRequest, rather than in a parameter. I see the following exception when visiting the URL {{http://localhost.localdomain:50075/browseBlock.jsp?blockId=5006108823351810567&blockSize=20&genstamp=1001&filename=%2Fuser%2Ftodd%2Fissue&datanodePort=50010&namenodeInfoPort=50070}} java.io.FileNotFoundException: File does not exist: / at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInternal(FSNamesystem.java:834) ... at org.apache.hadoop.hdfs.server.datanode.DatanodeJspHelper.generateFileDetails(DatanodeJspHelper.java:258) at org.apache.hadoop.hdfs.server.datanode.browseBlock_jsp._jspService(browseBlock_jsp.java:79) at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-593) Support for getting user home dir from server side
[ https://issues.apache.org/jira/browse/HDFS-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036503#comment-13036503 ] Sanjay Radia commented on HDFS-593: --- Let's revisit this issue. When writing the tests for viewfs's trash bin (see HADOOP-7284), I had to take into account that the tests ran on Mac or Linux boxes or HDFS, each of which has a different notion of the home directory. When this Jira was filed, the proposal was that the home dir be server-side (SS) config. That made sense to me. With viewfs (i.e. a client-side mount table) there is no server side, and furthermore the client-side mount table points to multiple file servers. Since viewfs is configured via config variables, it is quite easy to add a config variable for this. I proposed that in HADOOP-7284 and Todd agreed. But I think this topic deserves a fresh look; the two options are: * For HDFS, home-dir is a SS property with a default of /user; for viewfs it is determined from viewfs's config; and for localfs it is figured out dynamically. * Home dir is a config variable with a default of /user, and for viewfs it is determined from its config so that it can adapt to mounts of localfs and hdfs. Support for getting user home dir from server side -- Key: HDFS-593 URL: https://issues.apache.org/jira/browse/HDFS-593 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client, name-node Reporter: Kan Zhang This is a sub-task of HADOOP-4952. Currently the Path of the user home dir is constructed on the client side using the convention /user/$USER. HADOOP-4952 calls for it to be retrieved from the server side. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
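As an illustration of the second option above, a sketch of resolving the home directory from a config variable with a default of /user; the key name fs.homeDir.prefix is an assumption for illustration, not an agreed API.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

class HomeDirSketch {
  // Key name is hypothetical; the point is a default of /user that each
  // deployment (and, for viewfs, each mount) can override in its config.
  static Path homeDirectory(Configuration conf, String userName) {
    String prefix = conf.get("fs.homeDir.prefix", "/user");
    return new Path(prefix, userName);
  }
}
{code}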
[jira] [Updated] (HDFS-1952) FSEditLog.open() appears to succeed even if all EDITS directories fail
[ https://issues.apache.org/jira/browse/HDFS-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-1952: -- Attachment: hdfs-1952.patch Used --strip-prefix this time. Tested application to trunk with {{patch -p0 < hdfs-1952.patch}}; hopefully Hudson likes it. FSEditLog.open() appears to succeed even if all EDITS directories fail -- Key: HDFS-1952 URL: https://issues.apache.org/jira/browse/HDFS-1952 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Matt Foley Assignee: Andrew Wang Labels: newbie Attachments: hdfs-1952.patch, hdfs-1952.patch, hdfs-1952.patch FSEditLog.open() appears to succeed even if all of the individual directories failed to allow creation of an EditLogOutputStream. The problem and solution are essentially similar to that of HDFS-1505. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1575) viewing block from web UI broken
[ https://issues.apache.org/jira/browse/HDFS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036507#comment-13036507 ] Hadoop QA commented on HDFS-1575: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12479841/hdfs-1575-22.0.patch against trunk revision 1125057. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 5 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/593//console This message is automatically generated. viewing block from web UI broken Key: HDFS-1575 URL: https://issues.apache.org/jira/browse/HDFS-1575 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Aaron T. Myers Priority: Blocker Fix For: 0.22.0 Attachments: HDFS-1575, hdfs-1575-22.0.patch, hdfs-1575-trunk.0.patch, hdfs-1575-trunk.1.patch, hdfs-1575-trunk.2.patch, hdfs-1575-trunk.3.patch DatanodeJspHelper seems to expect the file path to be in the path info of the HttpRequest, rather than in a parameter. I see the following exception when visiting the URL {{http://localhost.localdomain:50075/browseBlock.jsp?blockId=5006108823351810567&blockSize=20&genstamp=1001&filename=%2Fuser%2Ftodd%2Fissue&datanodePort=50010&namenodeInfoPort=50070}} java.io.FileNotFoundException: File does not exist: / at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInternal(FSNamesystem.java:834) ... at org.apache.hadoop.hdfs.server.datanode.DatanodeJspHelper.generateFileDetails(DatanodeJspHelper.java:258) at org.apache.hadoop.hdfs.server.datanode.browseBlock_jsp._jspService(browseBlock_jsp.java:79) at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1964) Incorrect HTML unescaping in DatanodeJspHelper.java
[ https://issues.apache.org/jira/browse/HDFS-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-1964: - Attachment: hdfs-1964-trunk.0.patch Patch addressing the issue. Incorrect HTML unescaping in DatanodeJspHelper.java --- Key: HDFS-1964 URL: https://issues.apache.org/jira/browse/HDFS-1964 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 0.22.0, 0.23.0 Attachments: hdfs-1964-trunk.0.patch HDFS-1575 introduced some HTML unescaping of parameters so that viewing a file would work for paths containing HTML-escaped characters, but in two of the places did the unescaping either too early or too late. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1964) Incorrect HTML unescaping in DatanodeJspHelper.java
[ https://issues.apache.org/jira/browse/HDFS-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-1964: - Status: Patch Available (was: Open) Incorrect HTML unescaping in DatanodeJspHelper.java --- Key: HDFS-1964 URL: https://issues.apache.org/jira/browse/HDFS-1964 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 0.22.0, 0.23.0 Attachments: hdfs-1964-trunk.0.patch HDFS-1575 introduced some HTML unescaping of parameters so that viewing a file would work for paths containing HTML-escaped characters, but in two of the places did the unescaping either too early or too late. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
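A sketch of the pattern at issue, using a hypothetical helper rather than the patched DatanodeJspHelper code: the request parameter should be HTML-unescaped exactly once, at the point it is read, and used in raw form thereafter.
{code:java}
import javax.servlet.http.HttpServletRequest;
import org.apache.commons.lang.StringEscapeUtils;

class UnescapeSketch {
  // Unescape here, once; unescaping before the raw parameter is extracted,
  // or again after the path has already been used, reintroduces the bug.
  static String filenameParam(HttpServletRequest request) {
    String escaped = request.getParameter("filename");
    return escaped == null ? null : StringEscapeUtils.unescapeHtml(escaped);
  }
}
{code}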
[jira] [Updated] (HDFS-1963) HDFS rpm integration project
[ https://issues.apache.org/jira/browse/HDFS-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated HDFS-1963: Attachment: HDFS-1963-1.patch Store config templates in $PREFIX/share/hadoop/templates, and change the related script to use the new location. HDFS rpm integration project Key: HDFS-1963 URL: https://issues.apache.org/jira/browse/HDFS-1963 Project: Hadoop HDFS Issue Type: New Feature Components: build Environment: Java 6, RHEL 5.5 Reporter: Eric Yang Assignee: Eric Yang Attachments: HDFS-1963-1.patch, HDFS-1963.patch This jira corresponds to HADOOP-6255 and the associated directory layout change. The patch for creating HDFS rpm packaging should be posted here for the patch test build to verify against the hdfs svn trunk. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036525#comment-13036525 ] Todd Lipcon commented on HDFS-1057: --- aha! I think I understand what's going on here! The test has a thread which continually re-opens the file which is being written to. Since the file's in the middle of being written, it makes an RPC to the DataNode in order to determine the visible length of the file. This RPC is authenticated using the block token which came back in the LocatedBlocks object as the security ticket. When this RPC hits the IPC layer, it looks at its existing connections and sees none that can be re-used, since the block token differs between the two requesters. Hence, it reconnects, and we end up with hundreds or thousands of IPC connections to the datanode. This also explains why Sam doesn't see it on his 0.20 append branch -- there are no block tokens there, so the RPC connection is getting reused properly. I'll file another JIRA about this issue. Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append, 0.21.0, 0.22.0 Attachments: HDFS-1057-0.20-append.patch, conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections
IPCs done using block token-based tickets can't reuse connections - Key: HDFS-1965 URL: https://issues.apache.org/jira/browse/HDFS-1965 Project: Hadoop HDFS Issue Type: Bug Components: security Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.22.0 This is the reason that TestFileConcurrentReaders has been failing a lot. Reproducing a comment from HDFS-1057: The test has a thread which continually re-opens the file which is being written to. Since the file's in the middle of being written, it makes an RPC to the DataNode in order to determine the visible length of the file. This RPC is authenticated using the block token which came back in the LocatedBlocks object as the security ticket. When this RPC hits the IPC layer, it looks at its existing connections and sees none that can be re-used, since the block token differs between the two requesters. Hence, it reconnects, and we end up with hundreds or thousands of IPC connections to the datanode. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036530#comment-13036530 ] Tsz Wo (Nicholas), SZE commented on HDFS-1057: -- Todd, well done! Thanks for investigating it. Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append, 0.21.0, 0.22.0 Attachments: HDFS-1057-0.20-append.patch, conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections
[ https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036531#comment-13036531 ] Todd Lipcon commented on HDFS-1965: --- I can think of a couple possible solutions: a) make the methods that operate on a block take an additional parameter to contain block tokens, rather than using the normal token selector mechanism that scopes credentials on a per-connection basis. This has the advantage that we can even re-use an IPC connection across different blocks. b) when the client creates an IPC proxy to a DN, it can explicitly configure the maxIdleTime to 0 so that we don't leave connections hanging around after the call completes. This is less efficient than option A above, but it probably doesn't matter much for this use case. IPCs done using block token-based tickets can't reuse connections - Key: HDFS-1965 URL: https://issues.apache.org/jira/browse/HDFS-1965 Project: Hadoop HDFS Issue Type: Bug Components: security Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.22.0 This is the reason that TestFileConcurrentReaders has been failing a lot. Reproducing a comment from HDFS-1057: The test has a thread which continually re-opens the file which is being written to. Since the file's in the middle of being written, it makes an RPC to the DataNode in order to determine the visible length of the file. This RPC is authenticated using the block token which came back in the LocatedBlocks object as the security ticket. When this RPC hits the IPC layer, it looks at its existing connections and sees none that can be re-used, since the block token differs between the two requesters. Hence, it reconnects, and we end up with hundreds or thousands of IPC connections to the datanode. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
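A minimal sketch of option (b), assuming the standard ipc.client.connection.maxidletime IPC key; the proxy-creation call itself is elided since its exact signature varies by version.
{code:java}
import org.apache.hadoop.conf.Configuration;

class DatanodeIpcConfSketch {
  static Configuration confForDatanodeIpc(Configuration base) {
    Configuration dnConf = new Configuration(base);
    // Tear the connection down as soon as the call completes, so that
    // per-block-token connections cannot accumulate in the client's
    // connection cache.
    dnConf.setInt("ipc.client.connection.maxidletime", 0);
    return dnConf;
  }
}
{code}
As the comment notes, option (a) remains the more efficient fix, since one connection could then serve calls for many different blocks.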
[jira] [Created] (HDFS-1966) Encapsulate individual DataTransferProtocol op header
Encapsulate individual DataTransferProtocol op header - Key: HDFS-1966 URL: https://issues.apache.org/jira/browse/HDFS-1966 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node, hdfs client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE It will make a clear distinction between the variables used in the protocol and the others. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1371) One bad node can incorrectly flag many files as corrupt
[ https://issues.apache.org/jira/browse/HDFS-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036542#comment-13036542 ] Jitendra Nath Pandey commented on HDFS-1371: I have committed this. Thanks to Tanping! One bad node can incorrectly flag many files as corrupt --- Key: HDFS-1371 URL: https://issues.apache.org/jira/browse/HDFS-1371 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20.1, 0.23.0 Environment: yahoo internal version [knoguchi@gwgd4003 ~]$ hadoop version Hadoop 0.20.104.3.1007030707 Reporter: Koji Noguchi Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-1371.04252011.patch, HDFS-1371.0503.patch, HDFS-1371.0513.patch, HDFS-1371.0515.patch, HDFS-1371.0517.2.patch, HDFS-1371.0517.patch, HDFS-1371.0518.2.patch, HDFS-1371.0518.patch On our cluster, 12 files were reported as corrupt by fsck even though the replicas on the datanodes were healthy. Turns out that all the replicas (12 files x 3 replicas per file) were reported corrupt from one node. Surprisingly, these files were still readable/accessible from dfsclient (-get/-cat) without any problems. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1371) One bad node can incorrectly flag many files as corrupt
[ https://issues.apache.org/jira/browse/HDFS-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDFS-1371: --- Resolution: Fixed Status: Resolved (was: Patch Available) One bad node can incorrectly flag many files as corrupt --- Key: HDFS-1371 URL: https://issues.apache.org/jira/browse/HDFS-1371 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20.1, 0.23.0 Environment: yahoo internal version [knoguchi@gwgd4003 ~]$ hadoop version Hadoop 0.20.104.3.1007030707 Reporter: Koji Noguchi Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-1371.04252011.patch, HDFS-1371.0503.patch, HDFS-1371.0513.patch, HDFS-1371.0515.patch, HDFS-1371.0517.2.patch, HDFS-1371.0517.patch, HDFS-1371.0518.2.patch, HDFS-1371.0518.patch On our cluster, 12 files were reported as corrupt by fsck even though the replicas on the datanodes were healthy. Turns out that all the replicas (12 files x 3 replicas per file) were reported corrupt from one node. Surprisingly, these files were still readable/accessible from dfsclient (-get/-cat) without any problems. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1966) Encapsulate individual DataTransferProtocol op header
[ https://issues.apache.org/jira/browse/HDFS-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-1966: - Attachment: h1966_20110519.patch h1966_20110519.patch: added {{CopyBlockHeader}} for illustrating the idea. Encapsulate individual DataTransferProtocol op header - Key: HDFS-1966 URL: https://issues.apache.org/jira/browse/HDFS-1966 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node, hdfs client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h1966_20110519.patch It will make a clear distinction between the variables used in the protocol and the others. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
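The encapsulation idea, sketched with assumed field names (the actual h1966_20110519.patch may differ): each op's wire fields live in one header object with its own serialization methods, instead of being written inline at the call sites.
{code:java}
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

class CopyBlockHeaderSketch {
  private long blockId;
  private long generationStamp;

  // All fields the op puts on the wire are written in one place...
  void write(DataOutput out) throws IOException {
    out.writeLong(blockId);
    out.writeLong(generationStamp);
  }

  // ...and read back in one place, keeping protocol variables distinct
  // from the rest of the DataNode/client state.
  void readFields(DataInput in) throws IOException {
    blockId = in.readLong();
    generationStamp = in.readLong();
  }
}
{code}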
[jira] [Commented] (HDFS-1877) Create a functional test for file read/write
[ https://issues.apache.org/jira/browse/HDFS-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036546#comment-13036546 ] Hadoop QA commented on HDFS-1877: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12479831/TestWriteRead.patch against trunk revision 1125057. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.hdfs.TestHDFSTrash +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/590//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/590//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/590//console This message is automatically generated. Create a functional test for file read/write Key: HDFS-1877 URL: https://issues.apache.org/jira/browse/HDFS-1877 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.22.0 Reporter: CW Chung Priority: Minor Attachments: TestWriteRead.java, TestWriteRead.patch, TestWriteRead.patch It would be great to have a tool, running on a real grid, to perform functional tests (and stress tests to a certain extent) for the file operations. The tool would be written in Java and make HDFS API calls to read, write, append, and hflush Hadoop files. The tool would be usable standalone, or as a building block for other regression or stress test suites (written in shell, perl, python, etc). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
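A rough sketch of the core write/hflush/read check such a tool would perform; FileSystem, hflush, and readFully are the real APIs, while the flow and the path below are illustrative.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriteReadSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path file = new Path("/tmp/writeReadSketch");
    byte[] chunk = new byte[4096];

    FSDataOutputStream out = fs.create(file, true);
    out.write(chunk);
    out.hflush();  // make the bytes visible to concurrent readers
    out.close();

    // Read everything back with a positional read and let any
    // ChecksumException or EOFException fail the run.
    FSDataInputStream in = fs.open(file);
    byte[] back = new byte[chunk.length];
    in.readFully(0, back);
    in.close();
  }
}
{code}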
[jira] [Commented] (HDFS-1952) FSEditLog.open() appears to succeed even if all EDITS directories fail
[ https://issues.apache.org/jira/browse/HDFS-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036550#comment-13036550 ] Matt Foley commented on HDFS-1952: -- Sorry I missed this the first time. It's minor, so you don't have to re-spin the patch just for this, but for future reference: Per the coding guidelines (http://wiki.apache.org/hadoop/HowToContribute#Making_Changes) please add { } after if statements, even single-line ones. Thanks. +1 pending Hudson test-patch results. FSEditLog.open() appears to succeed even if all EDITS directories fail -- Key: HDFS-1952 URL: https://issues.apache.org/jira/browse/HDFS-1952 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Matt Foley Assignee: Andrew Wang Labels: newbie Attachments: hdfs-1952.patch, hdfs-1952.patch, hdfs-1952.patch FSEditLog.open() appears to succeed even if all of the individual directories failed to allow creation of an EditLogOutputStream. The problem and solution are essentially similar to that of HDFS-1505. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1877) Create a functional test for file read/write
[ https://issues.apache.org/jira/browse/HDFS-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036552#comment-13036552 ] Tsz Wo (Nicholas), SZE commented on HDFS-1877: -- CW, please grant license to ASF for your latest patch. Create a functional test for file read/write Key: HDFS-1877 URL: https://issues.apache.org/jira/browse/HDFS-1877 Project: Hadoop HDFS Issue Type: Test Components: test Reporter: CW Chung Priority: Minor Attachments: TestWriteRead.java, TestWriteRead.patch, TestWriteRead.patch It would be great to have a tool, running on a real grid, to perform functional tests (and stress tests to a certain extent) for the file operations. The tool would be written in Java and make HDFS API calls to read, write, append, and hflush Hadoop files. The tool would be usable standalone, or as a building block for other regression or stress test suites (written in shell, perl, python, etc). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1877) Create a functional test for file read/write
[ https://issues.apache.org/jira/browse/HDFS-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-1877: - Affects Version/s: (was: 0.22.0) Assignee: CW Chung Create a functional test for file read/write Key: HDFS-1877 URL: https://issues.apache.org/jira/browse/HDFS-1877 Project: Hadoop HDFS Issue Type: Test Components: test Reporter: CW Chung Assignee: CW Chung Priority: Minor Attachments: TestWriteRead.java, TestWriteRead.patch, TestWriteRead.patch It would be great to have a tool, running on a real grid, to perform functional tests (and stress tests to a certain extent) for the file operations. The tool would be written in Java and make HDFS API calls to read, write, append, and hflush Hadoop files. The tool would be usable standalone, or as a building block for other regression or stress test suites (written in shell, perl, python, etc). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections
[ https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036554#comment-13036554 ] Todd Lipcon commented on HDFS-1965: --- I implemented option (b) and have a test case that shows that it fixes the problem... BUT: the real DFSInputStream code seems to call RPC.stopProxy() after it uses the proxy, which should also avoid this issue. Doing so in my test case makes the case pass without any other fix. So there's still some mystery. IPCs done using block token-based tickets can't reuse connections - Key: HDFS-1965 URL: https://issues.apache.org/jira/browse/HDFS-1965 Project: Hadoop HDFS Issue Type: Bug Components: security Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.22.0 This is the reason that TestFileConcurrentReaders has been failing a lot. Reproducing a comment from HDFS-1057: The test has a thread which continually re-opens the file which is being written to. Since the file's in the middle of being written, it makes an RPC to the DataNode in order to determine the visible length of the file. This RPC is authenticated using the block token which came back in the LocatedBlocks object as the security ticket. When this RPC hits the IPC layer, it looks at its existing connections and sees none that can be re-used, since the block token differs between the two requesters. Hence, it reconnects, and we end up with hundreds or thousands of IPC connections to the datanode. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1921) Save namespace can cause NN to be unable to come up on restart
[ https://issues.apache.org/jira/browse/HDFS-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036561#comment-13036561 ] Aaron T. Myers commented on HDFS-1921: -- Sure, Matt. Here's the output from test-patch on branch-0.22: {noformat} +1 overall. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 system test framework. The patch passed system test framework compile. {noformat} Save namespace can cause NN to be unable to come up on restart -- Key: HDFS-1921 URL: https://issues.apache.org/jira/browse/HDFS-1921 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Aaron T. Myers Assignee: Matt Foley Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: hdfs-1505-1-test.txt, hdfs-1921-2.patch, hdfs-1921-2_v22.patch, hdfs-1921.txt, hdfs1921_v23.patch, hdfs1921_v23.patch I discovered this in the course of trying to implement a fix for HDFS-1505. Per the comment for {{FSImage.saveNamespace(...)}}, the algorithm for save namespace proceeds in the following order: # rename current to lastcheckpoint.tmp for all of them, # save image and recreate edits for all of them, # rename lastcheckpoint.tmp to previous.checkpoint. The problem is that step 3 occurs regardless of whether or not an error occurs for all storage directories in step 2. Upon restart, the NN will see non-existent or corrupt {{current}} directories, and no {{lastcheckpoint.tmp}} directories, and so will conclude that the storage directories are not formatted. This issue appears to be present on both 0.22 and 0.23. This should arguably be a 0.22/0.23 blocker. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
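For clarity, a hedged sketch of the corrected ordering implied by the description: lastcheckpoint.tmp is promoted only in directories where step 2 succeeded, and the whole operation fails if none did. Names are illustrative, not the actual FSImage code.
{code:java}
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class SaveNamespaceSketch {
  void saveNamespace(List<File> storageDirs) throws IOException {
    List<File> succeeded = new ArrayList<File>();
    for (File dir : storageDirs) {
      // Step 1: rename current to lastcheckpoint.tmp.
      new File(dir, "current").renameTo(new File(dir, "lastcheckpoint.tmp"));
      try {
        // Step 2: save image and recreate edits.
        saveImageAndEdits(dir);
        succeeded.add(dir);
      } catch (IOException ioe) {
        System.err.println("saveNamespace failed in " + dir + ": " + ioe);
      }
    }
    if (succeeded.isEmpty()) {
      throw new IOException("saveNamespace failed in all storage directories");
    }
    // Step 3: promote only the directories that completed step 2, so a
    // restart can still recover the others from lastcheckpoint.tmp.
    for (File dir : succeeded) {
      new File(dir, "lastcheckpoint.tmp")
          .renameTo(new File(dir, "previous.checkpoint"));
    }
  }

  private void saveImageAndEdits(File dir) throws IOException {
    // elided: write fsimage and recreate edits under dir/current
  }
}
{code}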
[jira] [Commented] (HDFS-1964) Incorrect HTML unescaping in DatanodeJspHelper.java
[ https://issues.apache.org/jira/browse/HDFS-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036576#comment-13036576 ] Eli Collins commented on HDFS-1964: --- +1 lgtm Incorrect HTML unescaping in DatanodeJspHelper.java --- Key: HDFS-1964 URL: https://issues.apache.org/jira/browse/HDFS-1964 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 0.22.0, 0.23.0 Attachments: hdfs-1964-trunk.0.patch HDFS-1575 introduced some HTML unescaping of parameters so that viewing a file would work for paths containing HTML-escaped characters, but in two of the places did the unescaping either too early or too late. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections
[ https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1965: -- Status: Patch Available (was: Open) IPCs done using block token-based tickets can't reuse connections - Key: HDFS-1965 URL: https://issues.apache.org/jira/browse/HDFS-1965 Project: Hadoop HDFS Issue Type: Bug Components: security Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.22.0 Attachments: hdfs-1965.txt This is the reason that TestFileConcurrentReaders has been failing a lot. Reproducing a comment from HDFS-1057: The test has a thread which continually re-opens the file which is being written to. Since the file's in the middle of being written, it makes an RPC to the DataNode in order to determine the visible length of the file. This RPC is authenticated using the block token which came back in the LocatedBlocks object as the security ticket. When this RPC hits the IPC layer, it looks at its existing connections and sees none that can be re-used, since the block token differs between the two requesters. Hence, it reconnects, and we end up with hundreds or thousands of IPC connections to the datanode. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections
[ https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1965: -- Attachment: hdfs-1965.txt Turns out the reason that RPC.stopProxy isn't effective in real life is that the WritableRpcEngine Client objects are cached in ClientCache with keys that aren't tied to principals. So, stopProxy doesn't actually cause the connection to disconnect. I'm not sure if that's a bug or by design. This patch now includes a regression test that simulates DFSClient closely. IPCs done using block token-based tickets can't reuse connections - Key: HDFS-1965 URL: https://issues.apache.org/jira/browse/HDFS-1965 Project: Hadoop HDFS Issue Type: Bug Components: security Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.22.0 Attachments: hdfs-1965.txt This is the reason that TestFileConcurrentReaders has been failing a lot. Reproducing a comment from HDFS-1057: The test has a thread which continually re-opens the file which is being written to. Since the file's in the middle of being written, it makes an RPC to the DataNode in order to determine the visible length of the file. This RPC is authenticated using the block token which came back in the LocatedBlocks object as the security ticket. When this RPC hits the IPC layer, it looks at its existing connections and sees none that can be re-used, since the block token differs between the two requesters. Hence, it reconnects, and we end up with hundreds or thousands of IPC connections to the datanode. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1602) NameNode storage failed replica restoration is broken
[ https://issues.apache.org/jira/browse/HDFS-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-1602: -- Summary: NameNode storage failed replica restoration is broken (was: Fix HADOOP-4885 for it is doesn't work as expected.) NameNode storage failed replica restoration is broken - Key: HDFS-1602 URL: https://issues.apache.org/jira/browse/HDFS-1602 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.21.0, 0.23.0 Reporter: Konstantin Boudnik Assignee: Boris Shkolnik Fix For: 0.22.0 Attachments: HDFS-1602-1.patch, HDFS-1602.patch, HDFS-1602v22.patch NameNode storage restore functionality doesn't work (as HDFS-903 demonstrated). This needs to be either disabled, or removed, or fixed. This feature also fails HDFS-1496 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-1967) TestHDFSTrash failing on trunk and 22
TestHDFSTrash failing on trunk and 22 - Key: HDFS-1967 URL: https://issues.apache.org/jira/browse/HDFS-1967 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Fix For: 0.22.0 Seems to have started failing recently in many commit builds as well as the last two nightly builds of 22: https://builds.apache.org/hudson/job/Hadoop-Hdfs-22-branch/51/testReport/org.apache.hadoop.hdfs/TestHDFSTrash/testTrashEmptier/ -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1963) HDFS rpm integration project
[ https://issues.apache.org/jira/browse/HDFS-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036586#comment-13036586 ] Hadoop QA commented on HDFS-1963: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12479849/HDFS-1963-1.patch against trunk revision 1125145. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 14 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. -1 release audit. The applied patch generated 2 release audit warnings (more than the trunk's current 0 warnings). -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.hdfs.TestHDFSTrash +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/595//testReport/ Release audit warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/595//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/595//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/595//console This message is automatically generated. HDFS rpm integration project Key: HDFS-1963 URL: https://issues.apache.org/jira/browse/HDFS-1963 Project: Hadoop HDFS Issue Type: New Feature Components: build Environment: Java 6, RHEL 5.5 Reporter: Eric Yang Assignee: Eric Yang Attachments: HDFS-1963-1.patch, HDFS-1963.patch This jira corresponds to HADOOP-6255 and the associated directory layout change. The patch for creating HDFS rpm packaging should be posted here for the patch test build to verify against the hdfs svn trunk. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1877) Create a functional test for file read/write
[ https://issues.apache.org/jira/browse/HDFS-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] CW Chung updated HDFS-1877: --- Attachment: TestWriteRead.patch Granted license to Apache. Otherwise, this version is the same as the one submitted 2 hours ago. Create a functional test for file read/write Key: HDFS-1877 URL: https://issues.apache.org/jira/browse/HDFS-1877 Project: Hadoop HDFS Issue Type: Test Components: test Reporter: CW Chung Assignee: CW Chung Priority: Minor Attachments: TestWriteRead.java, TestWriteRead.patch, TestWriteRead.patch, TestWriteRead.patch It would be great to have a tool, running on a real grid, to perform functional tests (and stress tests to a certain extent) for the file operations. The tool would be written in Java and make HDFS API calls to read, write, append, and hflush Hadoop files. The tool would be usable standalone, or as a building block for other regression or stress test suites (written in shell, perl, python, etc). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira