[jira] [Created] (HDFS-3165) HDFS Balancer scripts are broken
HDFS Balancer scripts are broken Key: HDFS-3165 URL: https://issues.apache.org/jira/browse/HDFS-3165 Project: Hadoop HDFS Issue Type: Bug Components: balancer Affects Versions: 1.0.1 Environment: HDFS Balancer Reporter: amith Priority: Minor HDFS Balancer scripts are broken -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3165) HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh
[ https://issues.apache.org/jira/browse/HDFS-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] amith updated HDFS-3165: Description: HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh HOST-10-18-40-95:/home/amith/V1R2/namenode1/sbin # ./start-balancer.sh ./start-balancer.sh: line 27: /home/amith/V1R2/namenode1/bin/hadoop-daemon.sh: No such file or directory Currently the hadoop-daemon.sh script is in sbin, not bin. was: HDFS Balancer scripts are broken Summary: HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh (was: HDFS Balancer scripts are broken) HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh Key: HDFS-3165 URL: https://issues.apache.org/jira/browse/HDFS-3165 Project: Hadoop HDFS Issue Type: Bug Components: balancer Affects Versions: 1.0.1 Environment: HDFS Balancer Reporter: amith Priority: Minor Original Estimate: 1m Remaining Estimate: 1m HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh HOST-10-18-40-95:/home/amith/V1R2/namenode1/sbin # ./start-balancer.sh ./start-balancer.sh: line 27: /home/amith/V1R2/namenode1/bin/hadoop-daemon.sh: No such file or directory Currently the hadoop-daemon.sh script is in sbin, not bin.
[jira] [Updated] (HDFS-3165) HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh
[ https://issues.apache.org/jira/browse/HDFS-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] amith updated HDFS-3165: Description: HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh HOST-xx-xx-xx-xx:/home/amith/V1R2/namenode1/sbin # ./start-balancer.sh ./start-balancer.sh: line 27: /home/amith/V1R2/namenode1/bin/hadoop-daemon.sh: No such file or directory Currently the hadoop-daemon.sh script is in sbin, not bin. was: HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh HOST-10-18-40-95:/home/amith/V1R2/namenode1/sbin # ./start-balancer.sh ./start-balancer.sh: line 27: /home/amith/V1R2/namenode1/bin/hadoop-daemon.sh: No such file or directory Currently the hadoop-daemon.sh script is in sbin, not bin. HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh Key: HDFS-3165 URL: https://issues.apache.org/jira/browse/HDFS-3165 Project: Hadoop HDFS Issue Type: Bug Components: balancer Affects Versions: 1.0.1 Environment: HDFS Balancer Reporter: amith Priority: Minor Original Estimate: 1m Remaining Estimate: 1m HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh HOST-xx-xx-xx-xx:/home/amith/V1R2/namenode1/sbin # ./start-balancer.sh ./start-balancer.sh: line 27: /home/amith/V1R2/namenode1/bin/hadoop-daemon.sh: No such file or directory Currently the hadoop-daemon.sh script is in sbin, not bin.
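The failure mode is easy to reproduce outside Hadoop. The sketch below fabricates a throwaway directory layout with mktemp (nothing here is the real Hadoop tree); it shows the lookup failing under bin/ and succeeding under sbin/, which is all the eventual fix to start-balancer.sh needs to change:

```shell
# Reproduce the bug and the fix with a disposable layout.
# Assumption: only the lookup path is wrong; the daemon script itself is fine.
root=$(mktemp -d)
mkdir -p "$root/bin" "$root/sbin"
printf '#!/bin/sh\necho started\n' > "$root/sbin/hadoop-daemon.sh"
chmod +x "$root/sbin/hadoop-daemon.sh"

# Broken: start-balancer.sh resolved the daemon script under bin/.
broken_msg=""
[ -x "$root/bin/hadoop-daemon.sh" ] || broken_msg="bin/hadoop-daemon.sh: No such file or directory"

# Fixed: resolve it under sbin/, next to start-balancer.sh itself.
fixed_out=$("$root/sbin/hadoop-daemon.sh")

echo "$broken_msg"
echo "$fixed_out"
rm -rf "$root"
```

The same pattern (computing the script's own directory with `dirname "$0"` and invoking siblings relative to it) is what keeps such scripts relocatable.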
[jira] [Commented] (HDFS-3165) HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh
[ https://issues.apache.org/jira/browse/HDFS-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242176#comment-13242176 ] Uma Maheswara Rao G commented on HDFS-3165: --- Thanks for testing the balancer. Good finding. Are you planning to provide a patch for this issue? HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh Key: HDFS-3165 URL: https://issues.apache.org/jira/browse/HDFS-3165 Project: Hadoop HDFS Issue Type: Bug Components: balancer Affects Versions: 1.0.1 Environment: HDFS Balancer Reporter: amith Priority: Minor Original Estimate: 1m Remaining Estimate: 1m HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh HOST-xx-xx-xx-xx:/home/amith/V1R2/namenode1/sbin # ./start-balancer.sh ./start-balancer.sh: line 27: /home/amith/V1R2/namenode1/bin/hadoop-daemon.sh: No such file or directory Currently the hadoop-daemon.sh script is in sbin, not bin.
[jira] [Commented] (HDFS-3165) HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh
[ https://issues.apache.org/jira/browse/HDFS-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242207#comment-13242207 ] amith commented on HDFS-3165: - Yes, I will provide one by EOD. HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh Key: HDFS-3165 URL: https://issues.apache.org/jira/browse/HDFS-3165 Project: Hadoop HDFS Issue Type: Bug Components: balancer Affects Versions: 1.0.1 Environment: HDFS Balancer Reporter: amith Priority: Minor Original Estimate: 1m Remaining Estimate: 1m HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh HOST-xx-xx-xx-xx:/home/amith/V1R2/namenode1/sbin # ./start-balancer.sh ./start-balancer.sh: line 27: /home/amith/V1R2/namenode1/bin/hadoop-daemon.sh: No such file or directory Currently the hadoop-daemon.sh script is in sbin, not bin.
[jira] [Commented] (HDFS-3066) cap space usage of default log4j rolling policy (hdfs specific changes)
[ https://issues.apache.org/jira/browse/HDFS-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242315#comment-13242315 ] Hudson commented on HDFS-3066: -- Integrated in Hadoop-Hdfs-trunk #1000 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1000/]) HDFS-3066. Cap space usage of default log4j rolling policy. Contributed by Patrick Hunt (Revision 1307100) Result = FAILURE eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1307100 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs cap space usage of default log4j rolling policy (hdfs specific changes) --- Key: HDFS-3066 URL: https://issues.apache.org/jira/browse/HDFS-3066 Project: Hadoop HDFS Issue Type: Improvement Components: scripts Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 2.0.0 Attachments: HDFS-3066.patch See HADOOP-8149 for background on this.
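For readers unfamiliar with what "capping space usage" means here: HADOOP-8149/HDFS-3066 move the default daemon logging onto log4j's RollingFileAppender, whose total disk footprint is bounded by the product of two settings. A sketch of such a configuration follows; the appender name RFA and the concrete values are illustrative assumptions, not necessarily the committed defaults:

```properties
# log4j 1.x RollingFileAppender: total disk usage is bounded by
# MaxFileSize * (MaxBackupIndex + 1) per log file.
log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
log4j.appender.RFA.MaxFileSize=256MB
log4j.appender.RFA.MaxBackupIndex=20
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
```

With the values above, each log is capped at roughly 256 MB x 21 files; the previous DailyRollingFileAppender default had no size bound at all.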
[jira] [Commented] (HDFS-3137) Bump LAST_UPGRADABLE_LAYOUT_VERSION to -16
[ https://issues.apache.org/jira/browse/HDFS-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242316#comment-13242316 ] Hudson commented on HDFS-3137: -- Integrated in Hadoop-Hdfs-trunk #1000 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1000/]) Move HDFS-3137 to the right place in CHANGES.txt (Revision 1307174) HDFS-3137. Bump LAST_UPGRADABLE_LAYOUT_VERSION to -16. Contributed by Eli Collins (Revision 1307173) Result = FAILURE eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1307174 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1307173 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/Storage.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOpCodes.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageSerialization.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSUpgradeFromImage.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/common/TestDistributedUpgrade.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/TestOfflineEditsViewer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/hadoop-14-dfs-dir.tgz * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/hadoop-dfs-dir.txt Bump LAST_UPGRADABLE_LAYOUT_VERSION to -16 -- Key: HDFS-3137 URL: https://issues.apache.org/jira/browse/HDFS-3137 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Eli Collins Assignee: Eli Collins Fix For: 2.0.0 Attachments: hdfs-3137.txt, hdfs-3137.txt, hdfs-3137.txt LAST_UPGRADABLE_LAYOUT_VERSION is currently -7, which corresponds to Hadoop 0.14. How about we bump it to -16, which corresponds to Hadoop 0.18? I don't think many people are using releases older than v0.18, and those who are probably want to upgrade to the latest stable release (v1.0). To upgrade to e.g. 0.23 they can still upgrade to v1.0 first and then upgrade again from there.
[jira] [Commented] (HDFS-3142) TestHDFSCLI.testAll is failing
[ https://issues.apache.org/jira/browse/HDFS-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242318#comment-13242318 ] Hudson commented on HDFS-3142: -- Integrated in Hadoop-Hdfs-trunk #1000 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1000/]) HDFS-3142. TestHDFSCLI.testAll is failing. Contributed by Brandon Li. (Revision 1307134) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1307134 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml TestHDFSCLI.testAll is failing -- Key: HDFS-3142 URL: https://issues.apache.org/jira/browse/HDFS-3142 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.0.0 Reporter: Eli Collins Assignee: Brandon Li Priority: Blocker Fix For: 2.0.0 Attachments: HDFS-3142.patch TestHDFSCLI.testAll is failing in the latest trunk/23 builds. Last good build was Mar 23rd.
[jira] [Commented] (HDFS-1599) Umbrella Jira for Improving HBASE support in HDFS
[ https://issues.apache.org/jira/browse/HDFS-1599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242322#comment-13242322 ] Uma Maheswara Rao G commented on HDFS-1599: --- Another point I wanted to mention: presently HBase uses a lot of reflection-based invocations to call HDFS methods (which are not exposed). Ex: {code}Field fIn = FilterInputStream.class.getDeclaredField("in");
fIn.setAccessible(true);
Object realIn = fIn.get(this.in);
// In hadoop 0.22, DFSInputStream is a standalone class. Before this,
// it was an inner class of DFSClient.
if (realIn.getClass().getName().endsWith("DFSInputStream")) {
  Method getFileLength = realIn.getClass()
      .getDeclaredMethod("getFileLength", new Class<?>[] {});
  getFileLength.setAccessible(true);
  long realLength = ((Long) getFileLength
      .invoke(realIn, new Object[] {})).longValue();{code} What is the plan for exposing some kind of real usage details to dependent components through special interfaces? I can see a lot of code filled with reflection-based invocations in HBase. That will really make version migrations harder later. Since they are internal APIs, HDFS may change them very easily, but HBase might depend on them tightly. In such cases, HBase will face a lot of difficulties migrating to newer versions. Let's start brainstorming on this issue here. Umbrella Jira for Improving HBASE support in HDFS - Key: HDFS-1599 URL: https://issues.apache.org/jira/browse/HDFS-1599 Project: Hadoop HDFS Issue Type: Improvement Reporter: Sanjay Radia Umbrella Jira for improved HBase support in HDFS
[jira] [Updated] (HDFS-3165) HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh
[ https://issues.apache.org/jira/browse/HDFS-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] amith updated HDFS-3165: Status: Patch Available (was: Open) HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh Key: HDFS-3165 URL: https://issues.apache.org/jira/browse/HDFS-3165 Project: Hadoop HDFS Issue Type: Bug Components: balancer Affects Versions: 1.0.1 Environment: HDFS Balancer Reporter: amith Priority: Minor Original Estimate: 1m Remaining Estimate: 1m HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh HOST-xx-xx-xx-xx:/home/amith/V1R2/namenode1/sbin # ./start-balancer.sh ./start-balancer.sh: line 27: /home/amith/V1R2/namenode1/bin/hadoop-daemon.sh: No such file or directory Currently the hadoop-daemon.sh script is in sbin, not bin.
[jira] [Updated] (HDFS-3165) HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh
[ https://issues.apache.org/jira/browse/HDFS-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] amith updated HDFS-3165: Status: Open (was: Patch Available) HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh Key: HDFS-3165 URL: https://issues.apache.org/jira/browse/HDFS-3165 Project: Hadoop HDFS Issue Type: Bug Components: balancer Affects Versions: 1.0.1 Environment: HDFS Balancer Reporter: amith Priority: Minor Original Estimate: 1m Remaining Estimate: 1m HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh HOST-xx-xx-xx-xx:/home/amith/V1R2/namenode1/sbin # ./start-balancer.sh ./start-balancer.sh: line 27: /home/amith/V1R2/namenode1/bin/hadoop-daemon.sh: No such file or directory Currently the hadoop-daemon.sh script is in sbin, not bin.
[jira] [Updated] (HDFS-3165) HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh
[ https://issues.apache.org/jira/browse/HDFS-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] amith updated HDFS-3165: Attachment: HDFS-3165.patch Hi Uma, here is a patch for the bug. Can you review it? HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh Key: HDFS-3165 URL: https://issues.apache.org/jira/browse/HDFS-3165 Project: Hadoop HDFS Issue Type: Bug Components: balancer Affects Versions: 1.0.1 Environment: HDFS Balancer Reporter: amith Priority: Minor Attachments: HDFS-3165.patch Original Estimate: 1m Remaining Estimate: 1m HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh HOST-xx-xx-xx-xx:/home/amith/V1R2/namenode1/sbin # ./start-balancer.sh ./start-balancer.sh: line 27: /home/amith/V1R2/namenode1/bin/hadoop-daemon.sh: No such file or directory Currently the hadoop-daemon.sh script is in sbin, not bin.
[jira] [Updated] (HDFS-3165) HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh
[ https://issues.apache.org/jira/browse/HDFS-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] amith updated HDFS-3165: Status: Patch Available (was: Open) HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh Key: HDFS-3165 URL: https://issues.apache.org/jira/browse/HDFS-3165 Project: Hadoop HDFS Issue Type: Bug Components: balancer Affects Versions: 1.0.1 Environment: HDFS Balancer Reporter: amith Priority: Minor Attachments: HDFS-3165.patch Original Estimate: 1m Remaining Estimate: 1m HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh HOST-xx-xx-xx-xx:/home/amith/V1R2/namenode1/sbin # ./start-balancer.sh ./start-balancer.sh: line 27: /home/amith/V1R2/namenode1/bin/hadoop-daemon.sh: No such file or directory Currently the hadoop-daemon.sh script is in sbin, not bin.
[jira] [Commented] (HDFS-3165) HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh
[ https://issues.apache.org/jira/browse/HDFS-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242327#comment-13242327 ] Hadoop QA commented on HDFS-3165: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520605/HDFS-3165.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2126//console This message is automatically generated. HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh Key: HDFS-3165 URL: https://issues.apache.org/jira/browse/HDFS-3165 Project: Hadoop HDFS Issue Type: Bug Components: balancer Affects Versions: 1.0.1 Environment: HDFS Balancer Reporter: amith Priority: Minor Attachments: HDFS-3165.patch Original Estimate: 1m Remaining Estimate: 1m HDFS Balancer scripts are referring to the wrong path of hadoop-daemon.sh HOST-xx-xx-xx-xx:/home/amith/V1R2/namenode1/sbin # ./start-balancer.sh ./start-balancer.sh: line 27: /home/amith/V1R2/namenode1/bin/hadoop-daemon.sh: No such file or directory Currently the hadoop-daemon.sh script is in sbin, not bin.
[jira] [Commented] (HDFS-3066) cap space usage of default log4j rolling policy (hdfs specific changes)
[ https://issues.apache.org/jira/browse/HDFS-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242346#comment-13242346 ] Hudson commented on HDFS-3066: -- Integrated in Hadoop-Mapreduce-trunk #1035 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1035/]) HDFS-3066. Cap space usage of default log4j rolling policy. Contributed by Patrick Hunt (Revision 1307100) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1307100 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs cap space usage of default log4j rolling policy (hdfs specific changes) --- Key: HDFS-3066 URL: https://issues.apache.org/jira/browse/HDFS-3066 Project: Hadoop HDFS Issue Type: Improvement Components: scripts Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 2.0.0 Attachments: HDFS-3066.patch See HADOOP-8149 for background on this.
[jira] [Commented] (HDFS-3155) Clean up FSDataset implementation related code.
[ https://issues.apache.org/jira/browse/HDFS-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242355#comment-13242355 ] Hudson commented on HDFS-3155: -- Integrated in Hadoop-Mapreduce-trunk #1035 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1035/]) HDFS-3155. Clean up FSDataset implementation related code. (Revision 1306582) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1306582 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DatanodeUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/ReplicaUnderRecovery.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestLeaseRecovery2.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPipelines.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeAdapter.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockReport.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/HAStressTestHarness.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/HATestUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDNFencing.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestPipelinesFailover.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyIsHot.java Clean up FSDataset implementation related code. -- Key: HDFS-3155 URL: https://issues.apache.org/jira/browse/HDFS-3155 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 2.0.0 Attachments: h3155_20120327.patch
[jira] [Commented] (HDFS-3143) TestGetBlocks.testGetBlocks is failing
[ https://issues.apache.org/jira/browse/HDFS-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242350#comment-13242350 ] Hudson commented on HDFS-3143: -- Integrated in Hadoop-Mapreduce-trunk #1035 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1035/]) HDFS-3143. TestGetBlocks.testGetBlocks is failing. Contributed by Arpit Gupta. (Revision 1306542) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1306542 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestGetBlocks.java TestGetBlocks.testGetBlocks is failing -- Key: HDFS-3143 URL: https://issues.apache.org/jira/browse/HDFS-3143 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.0.0 Reporter: Eli Collins Assignee: Arpit Gupta Fix For: 2.0.0 Attachments: HDFS-3143.patch TestGetBlocks.testGetBlocks is failing in the latest trunk/23 builds. Last good build was Mar 23rd.
[jira] [Commented] (HDFS-3137) Bump LAST_UPGRADABLE_LAYOUT_VERSION to -16
[ https://issues.apache.org/jira/browse/HDFS-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242348#comment-13242348 ] Hudson commented on HDFS-3137: -- Integrated in Hadoop-Mapreduce-trunk #1035 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1035/]) Move HDFS-3137 to the right place in CHANGES.txt (Revision 1307174) HDFS-3137. Bump LAST_UPGRADABLE_LAYOUT_VERSION to -16. Contributed by Eli Collins (Revision 1307173) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1307174 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1307173 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/Storage.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOpCodes.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageSerialization.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSUpgradeFromImage.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/common/TestDistributedUpgrade.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/TestOfflineEditsViewer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/hadoop-14-dfs-dir.tgz * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/hadoop-dfs-dir.txt Bump LAST_UPGRADABLE_LAYOUT_VERSION to -16 -- Key: HDFS-3137 URL: https://issues.apache.org/jira/browse/HDFS-3137 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Eli Collins Assignee: Eli Collins Fix For: 2.0.0 Attachments: hdfs-3137.txt, hdfs-3137.txt, hdfs-3137.txt LAST_UPGRADABLE_LAYOUT_VERSION is currently -7, which corresponds to Hadoop 0.14. How about we bump it to -16, which corresponds to Hadoop 0.18? I don't think many people are using releases older than v0.18, and those who are probably want to upgrade to the latest stable release (v1.0). To upgrade to e.g. 0.23 they can still upgrade to v1.0 first and then upgrade again from there.
[jira] [Commented] (HDFS-3156) TestDFSHAAdmin is failing post HADOOP-8202
[ https://issues.apache.org/jira/browse/HDFS-3156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242356#comment-13242356 ] Hudson commented on HDFS-3156: -- Integrated in Hadoop-Mapreduce-trunk #1035 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1035/]) HDFS-3156. TestDFSHAAdmin is failing post HADOOP-8202. Contributed by Aaron T. Myers. (Revision 1306517) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1306517 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDFSHAAdmin.java TestDFSHAAdmin is failing post HADOOP-8202 -- Key: HDFS-3156 URL: https://issues.apache.org/jira/browse/HDFS-3156 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.0.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 2.0.0 Attachments: HDFS-3156.patch TestDFSHAAdmin mocks a protocol object without implementing Closeable, which is now required.
[jira] [Commented] (HDFS-3160) httpfs should exec catalina instead of forking it
[ https://issues.apache.org/jira/browse/HDFS-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242349#comment-13242349 ] Hudson commented on HDFS-3160: -- Integrated in Hadoop-Mapreduce-trunk #1035 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1035/]) HDFS-3160. httpfs should exec catalina instead of forking it. Contributed by Roman Shaposhnik (Revision 1306665) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1306665 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/sbin/httpfs.sh * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt httpfs should exec catalina instead of forking it - Key: HDFS-3160 URL: https://issues.apache.org/jira/browse/HDFS-3160 Project: Hadoop HDFS Issue Type: Bug Components: scripts Affects Versions: 0.23.1 Reporter: Roman Shaposhnik Assignee: Roman Shaposhnik Fix For: 2.0.0 Attachments: HDFS-3160.patch.txt In Bigtop we would like to start supporting constant monitoring of the running daemons (BIGTOP-263). It would be nice if Oozie can support that requirement by execing Catalina instead of forking it off. Currently we have to track down the actual process being monitored through the script that still hangs around. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3142) TestHDFSCLI.testAll is failing
[ https://issues.apache.org/jira/browse/HDFS-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242351#comment-13242351 ] Hudson commented on HDFS-3142: -- Integrated in Hadoop-Mapreduce-trunk #1035 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1035/]) HDFS-3142. TestHDFSCLI.testAll is failing. Contributed by Brandon Li. (Revision 1307134) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1307134 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml TestHDFSCLI.testAll is failing -- Key: HDFS-3142 URL: https://issues.apache.org/jira/browse/HDFS-3142 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.0.0 Reporter: Eli Collins Assignee: Brandon Li Priority: Blocker Fix For: 2.0.0 Attachments: HDFS-3142.patch TestHDFSCLI.testAll is failing in the latest trunk/23 builds. Last good build was Mar 23rd. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3158) LiveNodes member of NameNodeMXBean should list non-DFS used space and capacity per DN
[ https://issues.apache.org/jira/browse/HDFS-3158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242353#comment-13242353 ] Hudson commented on HDFS-3158: -- Integrated in Hadoop-Mapreduce-trunk #1035 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1035/]) HDFS-3158. LiveNodes member of NameNodeMXBean should list non-DFS used space and capacity per DN. Contributed by Aaron T. Myers. (Revision 1306635) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1306635 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java LiveNodes member of NameNodeMXBean should list non-DFS used space and capacity per DN - Key: HDFS-3158 URL: https://issues.apache.org/jira/browse/HDFS-3158 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 2.0.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 2.0.0 Attachments: HDFS-3158.patch The LiveNodes section already lists the DFS used space per DN. It would be nice if it also listed the non-DFS used space and the capacity per DN. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
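[Editor's sketch] The LiveNodes section of NameNodeMXBean is exposed as a per-datanode map serialized to JSON. A minimal sketch of the kind of map the HDFS-3158 improvement adds keys to, assuming illustrative key names (`nonDfsUsedSpace`, `capacity`) rather than the exact FSNamesystem field names:

```java
import java.util.HashMap;
import java.util.Map;

public class LiveNodesSketch {
    // Builds the per-DN info map; the last two entries model the
    // additions proposed in HDFS-3158 (names are illustrative).
    public static Map<String, Object> describe(long used, long nonDfsUsed, long capacity) {
        Map<String, Object> node = new HashMap<>();
        node.put("usedSpace", used);             // already reported per DN
        node.put("nonDfsUsedSpace", nonDfsUsed); // proposed addition
        node.put("capacity", capacity);          // proposed addition
        return node;
    }
}
```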
[jira] [Commented] (HDFS-200) In HDFS, sync() not yet guarantees data available to the new readers
[ https://issues.apache.org/jira/browse/HDFS-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242382#comment-13242382 ] Uma Maheswara Rao G commented on HDFS-200: -- Hi Dhruba, it looks like the following code creates the problem in one special condition.
{code}
BlockInfo storedBlock = blocksMap.getStoredBlock(block);
+    if (storedBlock == null) {
+      // if the block with a WILDCARD generation stamp matches and the
+      // corresponding file is under construction, then accept this block.
+      // This block has a different generation stamp on the datanode
+      // because of a lease-recovery-attempt.
+      Block nblk = new Block(block.getBlockId());
+      storedBlock = blocksMap.getStoredBlock(nblk);
+      if (storedBlock != null && storedBlock.getINode() != null &&
+          (storedBlock.getGenerationStamp() <= block.getGenerationStamp() ||
+           storedBlock.getINode().isUnderConstruction())) {
+        NameNode.stateChangeLog.info("BLOCK* NameSystem.addStoredBlock: "
+            + "addStoredBlock request received for " + block + " on "
+            + node.getName() + " size " + block.getNumBytes()
+            + " and it belongs to a file under construction.");
+      } else {
+        storedBlock = null;
+      }
{code}
The events are as follows.
1) DN1-DN2-DN3 are in the pipeline with genstamp 1.
2) The client completes writing and closes the file.
3) Now DN3 is killed.
4) Now the file is reopened for append.
5) Now the pipeline contains DN1-DN2 with genstamp 2.
6) The client continues writing some more data.
7) Now DN3 is started. Its replica is present in the current directory because it was already finalized before.
8) DN3 triggers a block report.
9) Since the block with genstamp 1 is not in the BlocksMap, the NameNode looks it up with a WILDCARD generation stamp and finds the stored block, which carries the newer genstamp (2).
10) Since the file is under construction, it simply accepts the block and updates the BlocksMap.
The problem is that if a client is directed to DN3 for a read, the read will fail: the NN may hand out the block ID with the latest genstamp (2), and DN3 does not contain the block with genstamp 2.
Of course the data is also inconsistent. Thanks Uma In HDFS, sync() not yet guarantees data available to the new readers Key: HDFS-200 URL: https://issues.apache.org/jira/browse/HDFS-200 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.20-append Reporter: Tsz Wo (Nicholas), SZE Assignee: dhruba borthakur Priority: Blocker Fix For: 0.20-append, 0.20.205.0 Attachments: 4379_20081010TC3.java, HDFS-200.20-security.1.patch, Reader.java, Reader.java, ReopenProblem.java, Writer.java, Writer.java, checkLeases-fix-1.txt, checkLeases-fix-unit-test-1.txt, fsyncConcurrentReaders.txt, fsyncConcurrentReaders11_20.txt, fsyncConcurrentReaders12_20.txt, fsyncConcurrentReaders13_20.txt, fsyncConcurrentReaders14_20.txt, fsyncConcurrentReaders15_20.txt, fsyncConcurrentReaders16_20.txt, fsyncConcurrentReaders3.patch, fsyncConcurrentReaders4.patch, fsyncConcurrentReaders5.txt, fsyncConcurrentReaders6.patch, fsyncConcurrentReaders9.patch, hadoop-stack-namenode-aa0-000-12.u.powerset.com.log.gz, hdfs-200-ryan-existing-file-fail.txt, hypertable-namenode.log.gz, namenode.log, namenode.log, reopen_test.sh In the append design doc (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it says * A reader is guaranteed to be able to read data that was 'flushed' before the reader opened the file However, this feature is not yet implemented. Note that the operation 'flushed' is now called sync. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
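[Editor's sketch] The ten-step sequence in the comment above can be modeled as a tiny standalone predicate (hypothetical class and method names, not the actual FSNamesystem code): a stale replica report is accepted when the stored generation stamp is no newer than the reported one, or when the file is still under construction, which is exactly the case that lets DN3's old replica back into the BlocksMap.

```java
public class WildcardGenstampSketch {
    // Returns true if a reported replica would be accepted under the
    // condition quoted in the comment: the stored genstamp is at most the
    // reported one, OR the file is still under construction.
    public static boolean accepts(long storedGenstamp, long reportedGenstamp,
                                  boolean underConstruction) {
        return storedGenstamp <= reportedGenstamp || underConstruction;
    }

    public static void main(String[] args) {
        // Steps 9-10: NN stores genstamp 2, DN3 reports genstamp 1, file
        // reopened for append (under construction) -> stale replica accepted.
        System.out.println(accepts(2, 1, true));
        // With the file closed, the same stale report would be rejected.
        System.out.println(accepts(2, 1, false));
    }
}
```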
[jira] [Updated] (HDFS-3120) Enable hsync and hflush by default
[ https://issues.apache.org/jira/browse/HDFS-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3120: -- Target Version/s: 2.0.0 (was: 2.0.0, 1.1.0) Affects Version/s: (was: 1.0.1) Summary: Enable hsync and hflush by default (was: Provide ability to enable sync without append) Enable hsync and hflush by default -- Key: HDFS-3120 URL: https://issues.apache.org/jira/browse/HDFS-3120 Project: Hadoop HDFS Issue Type: Improvement Reporter: Eli Collins Assignee: Eli Collins Attachments: hdfs-3120.txt The work on branch-20-append was to support *sync*, for durable HBase WALs, not *append*. The branch-20-append implementation is known to be buggy. There's been confusion about this, we often answer queries on the list [like this|http://search-hadoop.com/m/wfed01VOIJ5]. Unfortunately, the way to enable correct sync on branch-1 for HBase is to set dfs.support.append to true in your config, which has the side effect of enabling append (which we don't want to do). Let's add a new *dfs.support.sync* option that enables working sync (which is basically the current dfs.support.append flag modulo one place where it's not referring to sync). For compatibility, if dfs.support.append is set, dfs.support.sync will be set as well. This way someone can enable sync for HBase and still keep the current behavior that if dfs.support.append is not set then an append operation will result in an IOE indicating append is not supported. We should do this on trunk as well, as there's no reason to conflate hsync and append with a single config even if append works. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
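[Editor's sketch] The compatibility rule described above (setting dfs.support.append implies sync, but not vice versa) can be sketched with a plain Properties lookup; the helper class itself is hypothetical, only the two config key names come from the description:

```java
import java.util.Properties;

public class SyncConfigSketch {
    // Sync is effectively enabled when either key is set: for
    // compatibility, enabling append also enables sync.
    public static boolean syncEnabled(Properties conf) {
        boolean append = Boolean.parseBoolean(conf.getProperty("dfs.support.append", "false"));
        boolean sync = Boolean.parseBoolean(conf.getProperty("dfs.support.sync", "false"));
        return sync || append;
    }

    // Append stays off unless explicitly requested, so an append()
    // call can still fail with an IOE while sync works.
    public static boolean appendEnabled(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty("dfs.support.append", "false"));
    }
}
```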
[jira] [Updated] (HDFS-3120) Provide ability to enable sync without append
[ https://issues.apache.org/jira/browse/HDFS-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3120: -- Attachment: hdfs-3120.txt Patch attached. - Enables non-append hsync/hflush paths by default - Adds a dfs.support.appends entry to hdfs-default.xml - Removes explicit enablement of dfs.support.appends from tests so we cover that append is enabled by default Provide ability to enable sync without append - Key: HDFS-3120 URL: https://issues.apache.org/jira/browse/HDFS-3120 Project: Hadoop HDFS Issue Type: Improvement Reporter: Eli Collins Assignee: Eli Collins Attachments: hdfs-3120.txt The work on branch-20-append was to support *sync*, for durable HBase WALs, not *append*. The branch-20-append implementation is known to be buggy. There's been confusion about this, we often answer queries on the list [like this|http://search-hadoop.com/m/wfed01VOIJ5]. Unfortunately, the way to enable correct sync on branch-1 for HBase is to set dfs.support.append to true in your config, which has the side effect of enabling append (which we don't want to do). Let's add a new *dfs.support.sync* option that enables working sync (which is basically the current dfs.support.append flag modulo one place where it's not referring to sync). For compatibility, if dfs.support.append is set, dfs.support.sync will be set as well. This way someone can enable sync for HBase and still keep the current behavior that if dfs.support.append is not set then an append operation will result in an IOE indicating append is not supported. We should do this on trunk as well, as there's no reason to conflate hsync and append with a single config even if append works. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3120) Enable hsync and hflush by default
[ https://issues.apache.org/jira/browse/HDFS-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3120: -- Status: Patch Available (was: Open) Enable hsync and hflush by default -- Key: HDFS-3120 URL: https://issues.apache.org/jira/browse/HDFS-3120 Project: Hadoop HDFS Issue Type: Improvement Reporter: Eli Collins Assignee: Eli Collins Attachments: hdfs-3120.txt The work on branch-20-append was to support *sync*, for durable HBase WALs, not *append*. The branch-20-append implementation is known to be buggy. There's been confusion about this, we often answer queries on the list [like this|http://search-hadoop.com/m/wfed01VOIJ5]. Unfortunately, the way to enable correct sync on branch-1 for HBase is to set dfs.support.append to true in your config, which has the side effect of enabling append (which we don't want to do). Let's add a new *dfs.support.sync* option that enables working sync (which is basically the current dfs.support.append flag modulo one place where it's not referring to sync). For compatibility, if dfs.support.append is set, dfs.support.sync will be set as well. This way someone can enable sync for HBase and still keep the current behavior that if dfs.support.append is not set then an append operation will result in an IOE indicating append is not supported. We should do this on trunk as well, as there's no reason to conflate hsync and append with a single config even if append works. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242592#comment-13242592 ] Sanjay Radia commented on HDFS-3077: *JournalDaemons or Bookies on Datanodes (Slave nodes) vs Master nodes* * Slave nodes such as Datanodes are decommissioned for various reasons. This is automatically handled and is a simple process that makes the operations of Hadoop simple. Hence if we add a Journal Daemon (or thread) or Bookie to the slave nodes it makes the system harder to manage. * It is preferable to run the Journal Daemons or Bookies on the master nodes - NN, ZK, JT etc. Quorum-based protocol for reading and writing edit logs --- Key: HDFS-3077 URL: https://issues.apache.org/jira/browse/HDFS-3077 Project: Hadoop HDFS Issue Type: New Feature Components: ha, name-node Reporter: Todd Lipcon Assignee: Todd Lipcon Currently, one of the weak points of the HA design is that it relies on shared storage such as an NFS filer for the shared edit log. One alternative that has been proposed is to depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated edit log on commodity hardware. This JIRA is to implement another alternative, based on a quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only by HDFS's needs rather than more generic use cases. More details to follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3138) Move DatanodeInfo#ipcPort to DatanodeID
[ https://issues.apache.org/jira/browse/HDFS-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3138: -- Resolution: Fixed Fix Version/s: 2.0.0 Target Version/s: (was: 2.0.0) Release Note: This change modifies DatanodeID, which is part of the client to server protocol, therefore clients must be upgraded with servers. Hadoop Flags: Incompatible change,Reviewed (was: Reviewed) Status: Resolved (was: Patch Available) The test failure is HDFS-3041. I've committed this and merged to branch-2. Thanks for the review ATM. Move DatanodeInfo#ipcPort to DatanodeID --- Key: HDFS-3138 URL: https://issues.apache.org/jira/browse/HDFS-3138 Project: Hadoop HDFS Issue Type: Improvement Reporter: Eli Collins Assignee: Eli Collins Fix For: 2.0.0 Attachments: hdfs-3138.txt, hdfs-3138.txt We can fix the following TODO once HDFS-3137 is committed. {code} //TODO: move it to DatanodeID once DatanodeID is not stored in FSImage out.writeShort(ipcPort); {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242603#comment-13242603 ] Sanjay Radia commented on HDFS-3077: Also one more thing * The master nodes have a spare disk that can be dedicated to JournalDaemon or Bookie, while a datanode does not have a spare disk to dedicate. Quorum-based protocol for reading and writing edit logs --- Key: HDFS-3077 URL: https://issues.apache.org/jira/browse/HDFS-3077 Project: Hadoop HDFS Issue Type: New Feature Components: ha, name-node Reporter: Todd Lipcon Assignee: Todd Lipcon Currently, one of the weak points of the HA design is that it relies on shared storage such as an NFS filer for the shared edit log. One alternative that has been proposed is to depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated edit log on commodity hardware. This JIRA is to implement another alternative, based on a quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only by HDFS's needs rather than more generic use cases. More details to follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3138) Move DatanodeInfo#ipcPort to DatanodeID
[ https://issues.apache.org/jira/browse/HDFS-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242604#comment-13242604 ] Hudson commented on HDFS-3138: -- Integrated in Hadoop-Hdfs-trunk-Commit #2030 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2030/]) HDFS-3138. Move DatanodeInfo#ipcPort to DatanodeID. Contributed by Eli Collins (Revision 1307553) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1307553 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeID.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeInfo.java Move DatanodeInfo#ipcPort to DatanodeID --- Key: HDFS-3138 URL: https://issues.apache.org/jira/browse/HDFS-3138 Project: Hadoop HDFS Issue Type: Improvement Reporter: Eli Collins Assignee: Eli Collins Fix For: 2.0.0 Attachments: hdfs-3138.txt, hdfs-3138.txt We can fix the following TODO once HDFS-3137 is committed. {code} //TODO: move it to DatanodeID once DatanodeID is not stored in FSImage out.writeShort(ipcPort); {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3138) Move DatanodeInfo#ipcPort to DatanodeID
[ https://issues.apache.org/jira/browse/HDFS-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242608#comment-13242608 ] Hudson commented on HDFS-3138: -- Integrated in Hadoop-Common-trunk-Commit #1955 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1955/]) HDFS-3138. Move DatanodeInfo#ipcPort to DatanodeID. Contributed by Eli Collins (Revision 1307553) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1307553 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeID.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeInfo.java Move DatanodeInfo#ipcPort to DatanodeID --- Key: HDFS-3138 URL: https://issues.apache.org/jira/browse/HDFS-3138 Project: Hadoop HDFS Issue Type: Improvement Reporter: Eli Collins Assignee: Eli Collins Fix For: 2.0.0 Attachments: hdfs-3138.txt, hdfs-3138.txt We can fix the following TODO once HDFS-3137 is committed. {code} //TODO: move it to DatanodeID once DatanodeID is not stored in FSImage out.writeShort(ipcPort); {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242616#comment-13242616 ] Todd Lipcon commented on HDFS-3077: --- Thanks for your comments, Sanjay. I agree on all points above. Sorry to have not reported progress on this - I've been spending my time primarily on HDFS-3042 for the past two weeks. I hope to make more progress on this soon. The current status is that I have implemented the basic quorum protocol for writes, but still in progress implementing recovery after a switch of the active node. Quorum-based protocol for reading and writing edit logs --- Key: HDFS-3077 URL: https://issues.apache.org/jira/browse/HDFS-3077 Project: Hadoop HDFS Issue Type: New Feature Components: ha, name-node Reporter: Todd Lipcon Assignee: Todd Lipcon Currently, one of the weak points of the HA design is that it relies on shared storage such as an NFS filer for the shared edit log. One alternative that has been proposed is to depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated edit log on commodity hardware. This JIRA is to implement another alternative, based on a quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only by HDFS's needs rather than more generic use cases. More details to follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3120) Enable hsync and hflush by default
[ https://issues.apache.org/jira/browse/HDFS-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242621#comment-13242621 ] Hadoop QA commented on HDFS-3120: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520627/hdfs-3120.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 51 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hdfs.TestFileAppend4 +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2127//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2127//console This message is automatically generated. Enable hsync and hflush by default -- Key: HDFS-3120 URL: https://issues.apache.org/jira/browse/HDFS-3120 Project: Hadoop HDFS Issue Type: Improvement Reporter: Eli Collins Assignee: Eli Collins Attachments: hdfs-3120.txt The work on branch-20-append was to support *sync*, for durable HBase WALs, not *append*. The branch-20-append implementation is known to be buggy. There's been confusion about this, we often answer queries on the list [like this|http://search-hadoop.com/m/wfed01VOIJ5]. Unfortunately, the way to enable correct sync on branch-1 for HBase is to set dfs.support.append to true in your config, which has the side effect of enabling append (which we don't want to do). 
Let's add a new *dfs.support.sync* option that enables working sync (which is basically the current dfs.support.append flag modulo one place where it's not referring to sync). For compatibility, if dfs.support.append is set, dfs.support.sync will be set as well. This way someone can enable sync for HBase and still keep the current behavior that if dfs.support.append is not set then an append operation will result in an IOE indicating append is not supported. We should do this on trunk as well, as there's no reason to conflate hsync and append with a single config even if append works. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Moved] (HDFS-3166) Hftp connections do not have a timeout
[ https://issues.apache.org/jira/browse/HDFS-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe moved HADOOP-8221 to HDFS-3166: -- Component/s: (was: fs) hdfs client Target Version/s: 0.23.2 (was: 0.23.2) Affects Version/s: (was: 0.24.0) (was: 0.23.0) 3.0.0 2.0.0 0.23.3 0.23.2 Key: HDFS-3166 (was: HADOOP-8221) Project: Hadoop HDFS (was: Hadoop Common) Hftp connections do not have a timeout -- Key: HDFS-3166 URL: https://issues.apache.org/jira/browse/HDFS-3166 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.2, 0.23.3, 2.0.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical Attachments: HADOOP-8221.branch-1.patch, HADOOP-8221.patch Hftp connections do not have read timeouts. This leads to indefinitely hung sockets when there is a network outage during which time the remote host closed the socket. This may also affect WebHdfs, etc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
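[Editor's sketch] For context on the fix: java.net.URLConnection has supported connect and read timeouts since Java 5, and a client that sets neither can block indefinitely on a dead peer, which is the failure mode described above. An illustrative helper (not the actual HftpFileSystem code):

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class TimeoutSketch {
    public static HttpURLConnection open(URL url, int timeoutMs) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(timeoutMs); // fail fast if the host is unreachable
        conn.setReadTimeout(timeoutMs);    // throw instead of hanging mid-read
        return conn;
    }
}
```

Note that openConnection() does not contact the server, so the timeouts can be set before any network I/O happens.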
[jira] [Commented] (HDFS-3166) Hftp connections do not have a timeout
[ https://issues.apache.org/jira/browse/HDFS-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242625#comment-13242625 ] Jason Lowe commented on HDFS-3166: -- I think the patch is not applying because these files are all in the HDFS project. Moving the bug to HDFS. Hftp connections do not have a timeout -- Key: HDFS-3166 URL: https://issues.apache.org/jira/browse/HDFS-3166 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.2, 0.23.3, 2.0.0, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical Attachments: HADOOP-8221.branch-1.patch, HADOOP-8221.patch Hftp connections do not have read timeouts. This leads to indefinitely hung sockets when there is a network outage during which time the remote host closed the socket. This may also affect WebHdfs, etc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3130) Move FSDataset implementation to a package
[ https://issues.apache.org/jira/browse/HDFS-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242628#comment-13242628 ] Tsz Wo (Nicholas), SZE commented on HDFS-3130: -- Thanks Uma for checking it. I will update the findbugs-exclude file. The javadoc warning was an existing problem: .../datanode/ReplicaInPipeline.java:55: warning - @param argument state is not a parameter name. The reason for the TestDatanodeRestart failure is that the dataset needs to be updated after restart. It is a bug in the patch. Move FSDataset implementation to a package - Key: HDFS-3130 URL: https://issues.apache.org/jira/browse/HDFS-3130 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h3130_20120328_svn_mv.patch, h3130_20120329b.patch, h3130_20120329b_svn_mv.patch, svn_mv.sh, svn_mv.sh -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3044) fsck move should be non-destructive by default
[ https://issues.apache.org/jira/browse/HDFS-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3044: --- Attachment: HDFS-3044-b1.004.patch * remember to run fsck -delete before checking to see if the file is really deleted (d'oh!) * add test that running fsck -move a few times in a row has no harmful effects fsck move should be non-destructive by default -- Key: HDFS-3044 URL: https://issues.apache.org/jira/browse/HDFS-3044 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Eli Collins Assignee: Colin Patrick McCabe Fix For: 1.1.0, 2.0.0 Attachments: HDFS-3044-b1.002.patch, HDFS-3044-b1.004.patch, HDFS-3044.002.patch, HDFS-3044.003.patch The fsck move behavior in the code and originally articulated in HADOOP-101 is: {quote}Current failure modes for DFS involve blocks that are completely missing. The only way to fix them would be to recover chains of blocks and put them into lost+found{quote} A directory is created with the file name, the blocks that are accessible are created as individual files in this directory, then the original file is removed. I suspect the rationale for this behavior was that you can't use files that are missing locations, and copying the block as files at least makes part of the files accessible. However this behavior can also result in permanent dataloss. Eg: - Some datanodes don't come up (eg due to a HW issues) and checkin on cluster startup, files with blocks where all replicas are on these set of datanodes are marked corrupt - Admin does fsck move, which deletes the corrupt files, saves whatever blocks were available - The HW issues with datanodes are resolved, they are started and join the cluster. The NN tells them to delete their blocks for the corrupt files since the file was deleted. 
I think we should:
- Make fsck move non-destructive by default (e.g. just do a move into lost+found)
- Make the destructive behavior optional (e.g. --destructive, so admins think about what they're doing)
- Provide better sanity checks and warnings, e.g. if you're running fsck and not all the slaves have checked in (if using dfs.hosts), fsck should print a warning that an admin has to explicitly override before doing anything destructive
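To make the proposed semantics concrete, here is a small Python sketch (not Hadoop code; the mock filesystem and the `fsck_move` function are invented for illustration). Recoverable blocks are copied into lost+found as individual per-block files, and the original entry is deleted only when an explicit destructive flag is set, so repeated runs are harmless:

```python
# Illustrative model of the proposed fsck -move semantics.
# A "filesystem" is a dict of path -> list of block ids; blocks listed
# in `available` are recoverable, the rest are missing.

def fsck_move(fs, available, path, destructive=False):
    """Copy the recoverable blocks of a corrupt file into lost+found.

    With destructive=False (the proposed default) the original entry
    is kept; with destructive=True the old behavior applies and the
    original file is removed.
    """
    recovered = [b for b in fs[path] if b in available]
    # one file per recoverable block, mirroring the behavior above
    for i, block in enumerate(recovered):
        fs["/lost+found%s.%d" % (path, i)] = [block]
    if destructive:
        del fs[path]
    return recovered

fs = {"/data/f1": ["blk_1", "blk_2", "blk_3"]}
available = {"blk_1", "blk_3"}
fsck_move(fs, available, "/data/f1")  # non-destructive: /data/f1 survives
assert "/data/f1" in fs
fsck_move(fs, available, "/data/f1", destructive=True)
assert "/data/f1" not in fs
```

Running the non-destructive form twice simply overwrites the same lost+found entries, which is the idempotence the new test in the patch checks for.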
[jira] [Commented] (HDFS-3138) Move DatanodeInfo#ipcPort to DatanodeID
[ https://issues.apache.org/jira/browse/HDFS-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242637#comment-13242637 ] Hudson commented on HDFS-3138:
--
Integrated in Hadoop-Mapreduce-trunk-Commit #1968 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1968/]) HDFS-3138. Move DatanodeInfo#ipcPort to DatanodeID. Contributed by Eli Collins (Revision 1307553) Result = ABORTED eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1307553
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeID.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeInfo.java

Move DatanodeInfo#ipcPort to DatanodeID
---
Key: HDFS-3138 URL: https://issues.apache.org/jira/browse/HDFS-3138 Project: Hadoop HDFS Issue Type: Improvement Reporter: Eli Collins Assignee: Eli Collins Fix For: 2.0.0 Attachments: hdfs-3138.txt, hdfs-3138.txt

We can fix the following TODO once HDFS-3137 is committed. {code} //TODO: move it to DatanodeID once DatanodeID is not stored in FSImage out.writeShort(ipcPort); {code}
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242659#comment-13242659 ] Bikas Saha commented on HDFS-3077:
--
You might want to switch this jira to HDFS-3042 and take the discussion off trunk.

Quorum-based protocol for reading and writing edit logs
---
Key: HDFS-3077 URL: https://issues.apache.org/jira/browse/HDFS-3077 Project: Hadoop HDFS Issue Type: New Feature Components: ha, name-node Reporter: Todd Lipcon Assignee: Todd Lipcon

Currently, one of the weak points of the HA design is that it relies on shared storage, such as an NFS filer, for the shared edit log. One alternative that has been proposed is to depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated edit log on commodity hardware. This JIRA is to implement another alternative, based on a quorum commit protocol, integrated more tightly into HDFS and with the requirements driven only by HDFS's needs rather than more generic use cases. More details to follow.
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242663#comment-13242663 ] Todd Lipcon commented on HDFS-3077:
---
While this work will combine nicely with HDFS-3042 (auto failover), it is a separate project. A quorum-based edit log storage mechanism is useful even for manual failover -- or even for non-HA environments where you want remote copies of the edit logs without deploying NFS.
[jira] [Updated] (HDFS-3130) Move FSDataset implementation to a package
[ https://issues.apache.org/jira/browse/HDFS-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3130:
-
Attachment: h3130_20120330_svn_mv.patch
h3130_20120330_svn_mv.patch: fixes findbugs, javadoc and TestDatanodeRestart.
[jira] [Updated] (HDFS-3130) Move FSDataset implementation to a package
[ https://issues.apache.org/jira/browse/HDFS-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3130:
-
Attachment: h3130_20120330.patch
h3130_20120330.patch: for jenkins.
[jira] [Updated] (HDFS-3130) Move FSDataset implementation to a package
[ https://issues.apache.org/jira/browse/HDFS-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3130:
-
Attachment: h3130_20120330.patch
[jira] [Updated] (HDFS-3130) Move FSDataset implementation to a package
[ https://issues.apache.org/jira/browse/HDFS-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3130:
-
Attachment: (was: h3130_20120330.patch)
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242697#comment-13242697 ] Andrew Purtell commented on HDFS-3077:
--
Re: JournalDaemons or Bookies on Datanodes (Slave nodes) vs Master nodes
Makes sense. However, there will be many more Datanodes than metadata nodes, so finding new candidates to participate in a quorum protocol as others are lost or decommissioned would be less challenging given that larger pool. For each federated HDFS volume will we need 3 metadata nodes?
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242704#comment-13242704 ] Todd Lipcon commented on HDFS-3077:
---
bq. Makes sense. However, there will be many more Datanodes than metadata nodes, so finding new candidates to participate in a quorum protocol as others are lost or decommissioned would be less challenging given that larger pool.
True. In the initial implementation, though, I don't plan to support online reconfiguration of the quorum participants. It would be a nice enhancement in the future.
bq. For each federated HDFS volume will we need 3 metadata nodes?
I was planning to have the journal daemons store the logs in a directory per namespace ID. So, one set of nodes could handle the logs for several NNs (obviously at the expense of performance if there are lots and lots of them).
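As a rough illustration of the two points in this exchange (a write is durable once a quorum acknowledges it, and one set of journal daemons serving several NNs via a per-namespace directory), here is a hedged Python sketch; the function names and on-disk layout are invented for illustration, not the actual HDFS-3077 design:

```python
import os

def quorum_met(acks, total_journals):
    # An edit is considered durable once a strict majority of the
    # journal daemons have acknowledged it (e.g. 2 of 3, or 3 of 5).
    return acks > total_journals // 2

def journal_dir(root, namespace_id):
    # One set of journal daemons can serve several federated NNs by
    # keeping a separate log directory per namespace ID.
    return os.path.join(root, "ns-%d" % namespace_id)

assert quorum_met(2, 3) and quorum_met(3, 5)
assert not quorum_met(1, 3)
assert journal_dir("/journal", 42) == "/journal/ns-42"
```

The majority rule is why three metadata nodes is the natural minimum: with 3 journal daemons the quorum tolerates one failure, with 5 it tolerates two.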
[jira] [Commented] (HDFS-3016) Security in unit tests
[ https://issues.apache.org/jira/browse/HDFS-3016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242722#comment-13242722 ] Todd Lipcon commented on HDFS-3016:
---
Hi Jaimin, Jitendra. I tried to run this test case, but it doesn't seem to work here:
todd@todd-w510:~/git/hadoop-common/hadoop-hdfs-project/hadoop-hdfs$ mvn test -PstartKdc -DstartKdc=true -Dtest=TestSecureNameNode
...
Tests in error: testName(org.apache.hadoop.hdfs.server.namenode.TestSecureNameNode): Failed on local exception: java.io.IOException: Couldn't setup connection for nn1/localh...@example.com to nn1/localh...@example.com; Host Details : local host is: todd-w510/127.0.0.1; destination host is: todd-w510:40329;
The test logs show:
2012-03-30 13:39:06,611 ERROR security.UserGroupInformation (UserGroupInformation.java:doAs(1208)) - PriviledgedActionException as:nn1/localh...@example.com (auth:KERBEROS) cause:java.io.IOException: Couldn't setup connection for nn1/localh...@example.com to nn1/localh...@example.com
Am I making a mistake in the invocation above? Looking at the actual users.ldif file, I don't see how this can work on any machine whose configured hostname is anything other than localhost, since the principals are inserted with localhost hostnames rather than being generated dynamically from the actual hostname.

Security in unit tests
---
Key: HDFS-3016 URL: https://issues.apache.org/jira/browse/HDFS-3016 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jaimin D Jetly Assignee: Jaimin D Jetly Attachments: HDFS-3016.patch

This jira tracks HDFS changes corresponding to HADOOP-8078.
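The portability problem Todd describes comes from hardcoding `localhost` into the service principals in the LDIF fixture. A fix along the lines he suggests would derive the principal from the machine's real hostname at setup time; a minimal Python sketch of that idea (the helper name and defaults are invented, not part of the patch):

```python
import socket

def service_principal(service="nn", realm="EXAMPLE.COM", host=None):
    # Build a Kerberos service principal of the form service/host@REALM
    # from the actual hostname, instead of hardcoding "localhost" the
    # way the users.ldif fixture does. Kerberos hostnames are
    # conventionally lowercase.
    host = host or socket.getfqdn()
    return "%s/%s@%s" % (service, host.lower(), realm)

# deterministic example with an explicit host
assert service_principal(host="Node1.example.com") == "nn/node1.example.com@EXAMPLE.COM"
```

Generating the fixture entries this way would let the test pass on any machine, whatever its configured hostname.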
[jira] [Updated] (HDFS-3050) rework OEV to share more code with the NameNode
[ https://issues.apache.org/jira/browse/HDFS-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3050:
---
Description: Currently, OEV (the offline edits viewer) re-implements all of the opcode parsing logic found in the NameNode. This duplicated code creates a maintenance burden for us. OEV should be refactored to simply use the normal EditLog parsing code rather than rolling its own. By using the existing FSEditLogLoader code to load edits in OEV, we can avoid having to update two places when the format changes. We should not put opcode checksums into the XML, because they are a serialization detail, not part of the data we're storing. This will also make it possible to modify the XML file and translate the modified file back to a binary edits log file. Finally, this change introduces --fix-txids. When OEV is passed this flag, it will close gaps in the transaction log by modifying the sequence numbers. This is useful if you want to modify the edit log XML (say, by removing a transaction) and transform the modified XML back into a valid binary edit log file.
was: Current, OEV (the offline edits viewer) re-implements all of the opcode parsing logic found in the NameNode. This duplicated code creates a maintenance burden for us. OEV should be refactored to simply use the normal EditLog parsing code, rather than rolling its own.
Summary: rework OEV to share more code with the NameNode (was: refactor OEV to share more code with the NameNode)

rework OEV to share more code with the NameNode
---
Key: HDFS-3050 URL: https://issues.apache.org/jira/browse/HDFS-3050 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-3050.006.patch, HDFS-3050.007.patch, HDFS-3050.008.patch, HDFS-3050.009.patch, HDFS-3050.010.patch, HDFS-3050.011.patch, HDFS-3050.012.patch, HDFS-3050.014.patch, HDFS-3050.015.patch
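The gap-closing idea behind --fix-txids can be shown with a minimal Python sketch (illustrative only; the real OEV renumbers edit log records, not bare integers): the surviving transactions are renumbered sequentially so the log is contiguous again after an edit is removed.

```python
def fix_txids(txids, first=1):
    # Reassign sequence numbers in order, closing any gaps left by
    # transactions that were edited out of the XML. Relative ordering
    # of the surviving transactions is preserved.
    return list(range(first, first + len(txids)))

# transaction 3 was deleted from the XML; renumbering closes the gap
assert fix_txids([1, 2, 4, 5]) == [1, 2, 3, 4]
```

Without this step, converting the edited XML back to binary would produce a log with a hole in its transaction-id sequence, which the NameNode would reject as corrupt.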
[jira] [Updated] (HDFS-3050) rework OEV to share more code with the NameNode
[ https://issues.apache.org/jira/browse/HDFS-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3050:
---
Attachment: HDFS-3050.016.patch
* rebase against current trunk
* posted improved patch description and name
[jira] [Commented] (HDFS-3130) Move FSDataset implementation to a package
[ https://issues.apache.org/jira/browse/HDFS-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242739#comment-13242739 ] Hadoop QA commented on HDFS-3130:
-
+1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520657/h3130_20120330.patch against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 54 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2129//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2129//console
This message is automatically generated.
[jira] [Commented] (HDFS-3050) rework OEV to share more code with the NameNode
[ https://issues.apache.org/jira/browse/HDFS-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242743#comment-13242743 ] Colin Patrick McCabe commented on HDFS-3050:
-
Nicholas said:
bq. There are editlog xml format changes. Is there a compatibility issue?
The offline edits viewer has @InterfaceAudience.Private, @InterfaceStability.Unstable. There are no proposals (as far as I know) to change that. We would like to change OIV to be @InterfaceAudience.Public, but that is a different issue.
cheers, Colin
[jira] [Commented] (HDFS-3050) rework OEV to share more code with the NameNode
[ https://issues.apache.org/jira/browse/HDFS-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242751#comment-13242751 ] Tsz Wo (Nicholas), SZE commented on HDFS-3050:
--
bq. The offline edits viewer has @InterfaceAudience.Private, @InterfaceStability.Unstable. There are no proposals (as far as I know) to change that.
You are right.
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242750#comment-13242750 ] Sanjay Radia commented on HDFS-3077:
-
bq. I was planning to have the journal daemons store the logs in a directory per namespace ID.
BK also has a notion of Ledger to support multiple clients - we are re-inventing a fair part of BK. One of the arguments was that BK is a general system while the JD solution has only one client - the NN. As soon as we support multiple NNs, the JournalDaemon solution is effectively becoming more general. Once our design is fleshed out, we should compare it with BK in an objective way.
[jira] [Commented] (HDFS-3050) rework OEV to share more code with the NameNode
[ https://issues.apache.org/jira/browse/HDFS-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242757#comment-13242757 ] Hadoop QA commented on HDFS-3050:
-
-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520667/HDFS-3050.016.patch against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 6 new or modified tests.
-1 patch. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2130//console
This message is automatically generated.
[jira] [Created] (HDFS-3167) CLI-based driver for MiniDFSCluster
CLI-based driver for MiniDFSCluster
---
Key: HDFS-3167 URL: https://issues.apache.org/jira/browse/HDFS-3167 Project: Hadoop HDFS Issue Type: New Feature Components: test Reporter: Henry Robinson Assignee: Henry Robinson Priority: Minor Fix For: 0.24.0

Picking up a thread again from MAPREDUCE-987: I've found it very useful to have a CLI driver for running a single-process DFS cluster, particularly when developing features in HDFS clients. For example, being able to spin up a local cluster easily was tremendously useful for correctness testing of HDFS-2834. I'd like to contribute a class, based on the patch for MAPREDUCE-987, that we've been using fairly extensively. Only for DFS, not MR, since much has changed MR-side since the original patch.
[jira] [Updated] (HDFS-3167) CLI-based driver for MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Robinson updated HDFS-3167: - Attachment: HDFS-3167.patch Patch for trunk. Example invocation instructions in the class Javadoc. CLI-based driver for MiniDFSCluster --- Key: HDFS-3167 URL: https://issues.apache.org/jira/browse/HDFS-3167 Project: Hadoop HDFS Issue Type: New Feature Components: test Reporter: Henry Robinson Assignee: Henry Robinson Priority: Minor Fix For: 0.24.0 Attachments: HDFS-3167.patch Picking up a thread again from MAPREDUCE-987, I've found it very useful to have a CLI driver for running a single-process DFS cluster, particularly when developing features in HDFS clients. For example, being able to spin up a local cluster easily was tremendously useful for correctness testing of HDFS-2834. I'd like to contribute a class based on the patch for MAPREDUCE-987 we've been using fairly extensively. Only for DFS, not MR since much has changed MR-side since the original patch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242806#comment-13242806 ] Todd Lipcon commented on HDFS-3077: --- bq. BK also has a notion of Ledger to support multiple clients - we are re-inventing a fair part of BK. One of the arguments was that BK is a general system while the JD solution has only one client - the NN. True, though BK takes effort to interleave all of the ledgers into a single sequential stream, while writing an index file to allow de-interleaving upon read. This is to support hundreds or thousands of concurrent WALs. In contrast, I think even large HDFS installations would only run a few federated NNs with the current design. So it's a much simpler problem, IMO. bq. Once our design is fleshed out, we should compare with BK in an objective way. Absolutely. Quorum-based protocol for reading and writing edit logs --- Key: HDFS-3077 URL: https://issues.apache.org/jira/browse/HDFS-3077 Project: Hadoop HDFS Issue Type: New Feature Components: ha, name-node Reporter: Todd Lipcon Assignee: Todd Lipcon Currently, one of the weak points of the HA design is that it relies on shared storage such as an NFS filer for the shared edit log. One alternative that has been proposed is to depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated edit log on commodity hardware. This JIRA is to implement another alternative, based on a quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only by HDFS's needs rather than more generic use cases. More details to follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
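The core of the quorum-commit idea under discussion can be sketched as: an edit is durable once a majority of journal replicas acknowledge it. This is an illustrative model only, with hypothetical names, not the HDFS-3077 design itself:

```python
# Illustrative quorum-commit sketch: a write is committed once a strict
# majority of journal replicas have acked it. Hypothetical model, not the
# actual HDFS-3077 protocol.

def majority(n):
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1

def is_committed(acks, num_journals):
    """acks: set of journal ids that acknowledged the write."""
    return len(acks) >= majority(num_journals)

print(is_committed({"jn1", "jn2"}, 3))  # True: 2 of 3 is a majority
print(is_committed({"jn1"}, 3))         # False: 1 of 3 is not
```

A majority rule like this tolerates the failure of any minority of journals while guaranteeing that any two majorities intersect, so a committed edit is never lost.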
[jira] [Commented] (HDFS-3167) CLI-based driver for MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242832#comment-13242832 ] Hadoop QA commented on HDFS-3167: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520674/HDFS-3167.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 5 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2131//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2131//console This message is automatically generated. CLI-based driver for MiniDFSCluster --- Key: HDFS-3167 URL: https://issues.apache.org/jira/browse/HDFS-3167 Project: Hadoop HDFS Issue Type: New Feature Components: test Affects Versions: 2.0.0 Reporter: Henry Robinson Assignee: Henry Robinson Priority: Minor Attachments: HDFS-3167.patch Picking up a thread again from MAPREDUCE-987, I've found it very useful to have a CLI driver for running a single-process DFS cluster, particularly when developing features in HDFS clients. For example, being able to spin up a local cluster easily was tremendously useful for correctness testing of HDFS-2834. I'd like to contribute a class based on the patch for MAPREDUCE-987 we've been using fairly extensively. Only for DFS, not MR since much has changed MR-side since the original patch. 
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242834#comment-13242834 ] Flavio Junqueira commented on HDFS-3077: bq. True, though BK takes effort to interleave all of the ledgers into a single sequential stream, while writing an index file to allow de-interleaving upon read. We originally did this interleaving because performance was affected even with just a handful of log files (ledgers). The solution we currently have applies to anywhere from a few to tens of thousands of ledgers. If you're concerned about read performance, we have not observed any significant reduction. About de-interleaving, we don't rearrange anything on read, if that's what you have in mind. Quorum-based protocol for reading and writing edit logs --- Key: HDFS-3077 URL: https://issues.apache.org/jira/browse/HDFS-3077 Project: Hadoop HDFS Issue Type: New Feature Components: ha, name-node Reporter: Todd Lipcon Assignee: Todd Lipcon Currently, one of the weak points of the HA design is that it relies on shared storage such as an NFS filer for the shared edit log. One alternative that has been proposed is to depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated edit log on commodity hardware. This JIRA is to implement another alternative, based on a quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only by HDFS's needs rather than more generic use cases. More details to follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3004) Implement Recovery Mode
[ https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3004: --- Attachment: HDFS-3004.036.patch * rebase on trunk * slight cleanup of EditLogBackupInputStream::nextValidOp() Implement Recovery Mode --- Key: HDFS-3004 URL: https://issues.apache.org/jira/browse/HDFS-3004 Project: Hadoop HDFS Issue Type: New Feature Components: tools Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-3004.010.patch, HDFS-3004.011.patch, HDFS-3004.012.patch, HDFS-3004.013.patch, HDFS-3004.015.patch, HDFS-3004.016.patch, HDFS-3004.017.patch, HDFS-3004.018.patch, HDFS-3004.019.patch, HDFS-3004.020.patch, HDFS-3004.022.patch, HDFS-3004.023.patch, HDFS-3004.024.patch, HDFS-3004.026.patch, HDFS-3004.027.patch, HDFS-3004.029.patch, HDFS-3004.030.patch, HDFS-3004.031.patch, HDFS-3004.032.patch, HDFS-3004.033.patch, HDFS-3004.034.patch, HDFS-3004.035.patch, HDFS-3004.036.patch, HDFS-3004__namenode_recovery_tool.txt When the NameNode metadata is corrupt for some reason, we want to be able to fix it. Obviously, we would prefer never to get in this case. In a perfect world, we never would. However, bad data on disk can happen from time to time, because of hardware errors or misconfigurations. In the past we have had to correct it manually, which is time-consuming and which can result in downtime. Recovery mode is initialized by the system administrator. When the NameNode starts up in Recovery Mode, it will try to load the FSImage file, apply all the edits from the edits log, and then write out a new image. Then it will shut down. Unlike in the normal startup process, the recovery mode startup process will be interactive. When the NameNode finds something that is inconsistent, it will prompt the operator as to what it should do. The operator can also choose to take the first option for all prompts by starting up with the '-f' flag, or typing 'a' at one of the prompts. 
I have reused as much code as possible from the NameNode in this tool. Hopefully, the effort that was spent developing this will also make the NameNode editLog and image processing even more robust than it already is. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
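The interactive prompt behavior described above (each inconsistency offers choices; the '-f' flag or answering 'a' takes the first option for all remaining prompts) can be sketched as follows. This is an illustrative model with hypothetical names, not the actual NameNode recovery code:

```python
# Illustrative sketch of the recovery-mode prompt loop described above:
# each inconsistency offers a list of choices; force=True (the '-f' flag)
# or an 'a' answer switches to always taking the first choice.
# Hypothetical model, not the HDFS-3004 implementation.

def resolve(inconsistencies, answers, force=False):
    """inconsistencies: list of (description, [choices]).
    answers: iterator of operator responses; 'a' means "always first
    option from now on", otherwise a choice index as a string."""
    decisions = []
    auto = force
    for desc, choices in inconsistencies:
        if auto:
            decisions.append((desc, choices[0]))
            continue
        ans = next(answers)
        if ans == "a":
            auto = True
            decisions.append((desc, choices[0]))
        else:
            decisions.append((desc, choices[int(ans)]))
    return decisions

issues = [("bad opcode", ["skip", "abort"]),
          ("gap in txids", ["continue", "abort"])]
# Answering 'a' at the first prompt takes the first option for everything:
print(resolve(issues, iter(["a"])))
```

Running with `force=True` (the '-f' flag in the description) never consults the operator at all, which is what makes unattended recovery possible.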
[jira] [Updated] (HDFS-3150) Add option for clients to contact DNs via hostname in branch-1
[ https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3150: -- Attachment: hdfs-3150-b1.txt Patch attached. Aside from the new tests, I confirmed via debug logging that we're now connecting to DNs by hostname. Add option for clients to contact DNs via hostname in branch-1 -- Key: HDFS-3150 URL: https://issues.apache.org/jira/browse/HDFS-3150 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node, hdfs client Reporter: Eli Collins Assignee: Eli Collins Attachments: hdfs-3150-b1.txt Per the document attached to HADOOP-8198, this is just for branch-1, and unbreaks DN multihoming. The datanode can be configured to listen on a bond, or all interfaces by specifying the wildcard in the dfs.datanode.*.address configuration options; however, per HADOOP-6867 only the source address of the registration is exposed to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup; this had the side effect of breaking DN multihoming. In order to fix it, let's add back the option for Datanodes to be accessed by hostname. This can be done by: # Modifying the primary field of the Datanode descriptor to be the hostname, or # Modifying Client/Datanode - Datanode access to use the hostname field instead of the IP I'd like to go with approach #2 as it does not require making an incompatible change to the client protocol, and is much less invasive. It minimizes the scope of modification to just places where clients and Datanodes connect, vs changing all uses of Datanode identifiers. 
New client and Datanode configuration options are introduced: - {{dfs.client.use.datanode.hostname}} indicates all client to datanode connections should use the datanode hostname (as clients outside cluster may not be able to route the IP) - {{dfs.datanode.use.datanode.hostname}} indicates whether Datanodes should use hostnames when connecting to other Datanodes for data transfer If the configuration options are not used, there is no change in the current behavior. I'm doing something similar to #1 btw in trunk in HDFS-3144 - refactoring the use of DatanodeID to use the right field (IP, IP:xferPort, hostname, etc) based on the context the ID is being used in, vs always using the IP:xferPort as the Datanode's name, and using the name everywhere. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
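The config-driven address selection described above amounts to picking the hostname or the IP field of a DatanodeID depending on {{dfs.client.use.datanode.hostname}}. A minimal sketch of that decision (the dict-based DatanodeID stand-in is hypothetical, not the branch-1 patch's code):

```python
# Illustrative sketch of hostname-vs-IP selection controlled by
# dfs.client.use.datanode.hostname, as described above. The dict-based
# datanode record is a hypothetical stand-in for DatanodeID.

def datanode_address(dn, conf):
    """Return the address a client should dial for data transfer.
    dn: dict with 'hostname', 'ip', and 'xfer_port' fields."""
    use_hostname = conf.get("dfs.client.use.datanode.hostname", False)
    host = dn["hostname"] if use_hostname else dn["ip"]
    return "%s:%d" % (host, dn["xfer_port"])

dn = {"hostname": "dn1.example.com", "ip": "10.0.0.5", "xfer_port": 50010}
print(datanode_address(dn, {}))  # default behavior: 10.0.0.5:50010
print(datanode_address(dn, {"dfs.client.use.datanode.hostname": True}))
# with the option set: dn1.example.com:50010
```

Defaulting to the IP preserves the existing behavior (no DNS lookup on the hot path); clients outside the cluster that cannot route the internal IP opt in to hostnames.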
[jira] [Updated] (HDFS-3000) Add a public API for setting quotas
[ https://issues.apache.org/jira/browse/HDFS-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3000: - Target Version/s: 2.0.0 (was: 0.23.2) Affects Version/s: (was: 0.23.1) 2.0.0 Status: Patch Available (was: Open) Add a public API for setting quotas --- Key: HDFS-3000 URL: https://issues.apache.org/jira/browse/HDFS-3000 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 2.0.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3000.patch Currently one can set the quota of a file or directory from the command line, but if a user wants to set it programmatically, they need to use DistributedFileSystem, which is annotated InterfaceAudience.Private. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3168) Clean up FSNamesystem and BlockManager
Clean up FSNamesystem and BlockManager -- Key: HDFS-3168 URL: https://issues.apache.org/jira/browse/HDFS-3168 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3000) Add a public API for setting quotas
[ https://issues.apache.org/jira/browse/HDFS-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3000: - Attachment: HDFS-3000.patch Here's a patch which adds the HDFS administrative API as discussed. Please review. Add a public API for setting quotas --- Key: HDFS-3000 URL: https://issues.apache.org/jira/browse/HDFS-3000 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 2.0.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3000.patch Currently one can set the quota of a file or directory from the command line, but if a user wants to set it programmatically, they need to use DistributedFileSystem, which is annotated InterfaceAudience.Private. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3000) Add a public API for setting quotas
[ https://issues.apache.org/jira/browse/HDFS-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3000: - Attachment: HDFS-3000.patch Right after posting the patch I noticed some goofy indentation and stale javadoc comments from an earlier version of the patch which declared more exception types. Here's an updated patch. Sorry for the noise. Add a public API for setting quotas --- Key: HDFS-3000 URL: https://issues.apache.org/jira/browse/HDFS-3000 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 2.0.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3000.patch, HDFS-3000.patch Currently one can set the quota of a file or directory from the command line, but if a user wants to set it programmatically, they need to use DistributedFileSystem, which is annotated InterfaceAudience.Private. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3168) Clean up FSNamesystem and BlockManager
[ https://issues.apache.org/jira/browse/HDFS-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3168: - Attachment: h3168_20120330.patch h3168_20120330.patch: - remove unnecessary throw IOException; - change fields to final; - remove DFSConfigKeys.DFS_NAMENODE_UPGRADE_PERMISSION_KEY. Clean up FSNamesystem and BlockManager -- Key: HDFS-3168 URL: https://issues.apache.org/jira/browse/HDFS-3168 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h3168_20120330.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3168) Clean up FSNamesystem and BlockManager
[ https://issues.apache.org/jira/browse/HDFS-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3168: - Status: Patch Available (was: Open) Clean up FSNamesystem and BlockManager -- Key: HDFS-3168 URL: https://issues.apache.org/jira/browse/HDFS-3168 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h3168_20120330.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3167) CLI-based driver for MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242898#comment-13242898 ] Aaron T. Myers commented on HDFS-3167: -- Hey Henry, patch looks pretty good after a quick look. I haven't actually run it, though. Can you comment on what testing you've done? A few little things I noticed: # Why do you initialize nameNodePort to 20500? I don't think it will actually get used, since you later specify 0 as the default. # It's not obvious to me why you output some error messages using LOG.info(...), and others using System.err.println(...). Unless there's some good reason, I'd suggest you either be consistent or add a comment explaining what the distinction for using one vs. the other is. # I don't see how stop(...) will ever be called. CLI-based driver for MiniDFSCluster --- Key: HDFS-3167 URL: https://issues.apache.org/jira/browse/HDFS-3167 Project: Hadoop HDFS Issue Type: New Feature Components: test Affects Versions: 2.0.0 Reporter: Henry Robinson Assignee: Henry Robinson Priority: Minor Attachments: HDFS-3167.patch Picking up a thread again from MAPREDUCE-987, I've found it very useful to have a CLI driver for running a single-process DFS cluster, particularly when developing features in HDFS clients. For example, being able to spin up a local cluster easily was tremendously useful for correctness testing of HDFS-2834. I'd like to contribute a class based on the patch for MAPREDUCE-987 we've been using fairly extensively. Only for DFS, not MR since much has changed MR-side since the original patch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1599) Umbrella Jira for Improving HBASE support in HDFS
[ https://issues.apache.org/jira/browse/HDFS-1599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242900#comment-13242900 ] Tsz Wo (Nicholas), SZE commented on HDFS-1599: -- Uma, you are right that we should have some public API for HBase and other projects. What are the other methods that HBase requires? Umbrella Jira for Improving HBASE support in HDFS - Key: HDFS-1599 URL: https://issues.apache.org/jira/browse/HDFS-1599 Project: Hadoop HDFS Issue Type: Improvement Reporter: Sanjay Radia Umbrella Jira for improved HBase support in HDFS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3004) Implement Recovery Mode
[ https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242899#comment-13242899 ] Hadoop QA commented on HDFS-3004: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520690/HDFS-3004.036.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 21 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2132//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2132//console This message is automatically generated. 
Implement Recovery Mode --- Key: HDFS-3004 URL: https://issues.apache.org/jira/browse/HDFS-3004 Project: Hadoop HDFS Issue Type: New Feature Components: tools Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-3004.010.patch, HDFS-3004.011.patch, HDFS-3004.012.patch, HDFS-3004.013.patch, HDFS-3004.015.patch, HDFS-3004.016.patch, HDFS-3004.017.patch, HDFS-3004.018.patch, HDFS-3004.019.patch, HDFS-3004.020.patch, HDFS-3004.022.patch, HDFS-3004.023.patch, HDFS-3004.024.patch, HDFS-3004.026.patch, HDFS-3004.027.patch, HDFS-3004.029.patch, HDFS-3004.030.patch, HDFS-3004.031.patch, HDFS-3004.032.patch, HDFS-3004.033.patch, HDFS-3004.034.patch, HDFS-3004.035.patch, HDFS-3004.036.patch, HDFS-3004__namenode_recovery_tool.txt When the NameNode metadata is corrupt for some reason, we want to be able to fix it. Obviously, we would prefer never to get in this case. In a perfect world, we never would. However, bad data on disk can happen from time to time, because of hardware errors or misconfigurations. In the past we have had to correct it manually, which is time-consuming and which can result in downtime. Recovery mode is initialized by the system administrator. When the NameNode starts up in Recovery Mode, it will try to load the FSImage file, apply all the edits from the edits log, and then write out a new image. Then it will shut down. Unlike in the normal startup process, the recovery mode startup process will be interactive. When the NameNode finds something that is inconsistent, it will prompt the operator as to what it should do. The operator can also choose to take the first option for all prompts by starting up with the '-f' flag, or typing 'a' at one of the prompts. I have reused as much code as possible from the NameNode in this tool. Hopefully, the effort that was spent developing this will also make the NameNode editLog and image processing even more robust than it already is. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1599) Umbrella Jira for Improving HBASE support in HDFS
[ https://issues.apache.org/jira/browse/HDFS-1599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242907#comment-13242907 ] Todd Lipcon commented on HDFS-1599: --- Most of the reflection in HBase has to do with version compatibility, not accessing private APIs. Adding a new API on HDFS doesn't solve the problem, really, since the whole reason for the reflection is to compile against old versions which don't have the new APIs :) Umbrella Jira for Improving HBASE support in HDFS - Key: HDFS-1599 URL: https://issues.apache.org/jira/browse/HDFS-1599 Project: Hadoop HDFS Issue Type: Improvement Reporter: Sanjay Radia Umbrella Jira for improved HBase support in HDFS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3044) fsck move should be non-destructive by default
[ https://issues.apache.org/jira/browse/HDFS-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242909#comment-13242909 ] Eli Collins commented on HDFS-3044: --- +1 updated patch looks great fsck move should be non-destructive by default -- Key: HDFS-3044 URL: https://issues.apache.org/jira/browse/HDFS-3044 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Eli Collins Assignee: Colin Patrick McCabe Fix For: 1.1.0, 2.0.0 Attachments: HDFS-3044-b1.002.patch, HDFS-3044-b1.004.patch, HDFS-3044.002.patch, HDFS-3044.003.patch The fsck move behavior in the code and originally articulated in HADOOP-101 is: {quote}Current failure modes for DFS involve blocks that are completely missing. The only way to fix them would be to recover chains of blocks and put them into lost+found{quote} A directory is created with the file name, the blocks that are accessible are created as individual files in this directory, then the original file is removed. I suspect the rationale for this behavior was that you can't use files that are missing locations, and copying the blocks as files at least makes part of the files accessible. However this behavior can also result in permanent data loss. Eg: - Some datanodes don't come up (eg due to HW issues) and check in on cluster startup; files with blocks where all replicas are on this set of datanodes are marked corrupt - Admin does fsck move, which deletes the corrupt files, saving whatever blocks were available - The HW issues with the datanodes are resolved; they are started and join the cluster. The NN tells them to delete their blocks for the corrupt files since the files were deleted. 
I think we should: - Make fsck move non-destructive by default (eg just do a move into lost+found) - Make the destructive behavior optional (eg --destructive so admins think about what they're doing) - Provide better sanity checks and warnings, eg if you're running fsck and not all the slaves have checked in (if using dfs.hosts) then fsck should print a warning indicating this, which an admin should have to override if they want to do something destructive -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
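The proposed semantics boil down to: salvage whatever blocks are readable into lost+found, and delete the original only when an explicit destructive flag is given. A toy model of that split (hypothetical dict-based filesystem, not the fsck code):

```python
# Illustrative sketch of the non-destructive-by-default fsck move proposed
# above: readable blocks are always copied into lost+found; the original
# file is removed only with an explicit destructive flag. The dict-based
# "filesystem" here is a hypothetical stand-in, not the NameNode.

def fsck_move(fs, path, destructive=False):
    blocks = fs["files"][path]
    # Salvage only the blocks that still have a readable replica:
    salvaged = [b for b in blocks if b in fs["available_blocks"]]
    fs["lost+found"][path] = salvaged
    if destructive:
        # Old behavior, now opt-in: delete the (possibly recoverable) file.
        del fs["files"][path]

fs = {"files": {"/a": ["b1", "b2", "b3"]},
      "available_blocks": {"b1", "b3"},   # b2's replicas are offline
      "lost+found": {}}
fsck_move(fs, "/a")                  # default: non-destructive
print("/a" in fs["files"])           # True: original kept
print(fs["lost+found"]["/a"])        # ['b1', 'b3']
```

With the default shown here, datanodes that come back later can still restore b2's replicas, because the file (and thus its block list) was never deleted.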
[jira] [Commented] (HDFS-3044) fsck move should be non-destructive by default
[ https://issues.apache.org/jira/browse/HDFS-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242912#comment-13242912 ] Eli Collins commented on HDFS-3044: --- I've committed it to branch-1 fsck move should be non-destructive by default -- Key: HDFS-3044 URL: https://issues.apache.org/jira/browse/HDFS-3044 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Eli Collins Assignee: Colin Patrick McCabe Fix For: 1.1.0, 2.0.0 Attachments: HDFS-3044-b1.002.patch, HDFS-3044-b1.004.patch, HDFS-3044.002.patch, HDFS-3044.003.patch The fsck move behavior in the code and originally articulated in HADOOP-101 is: {quote}Current failure modes for DFS involve blocks that are completely missing. The only way to fix them would be to recover chains of blocks and put them into lost+found{quote} A directory is created with the file name, the blocks that are accessible are created as individual files in this directory, then the original file is removed. I suspect the rationale for this behavior was that you can't use files that are missing locations, and copying the blocks as files at least makes part of the files accessible. However this behavior can also result in permanent data loss. Eg: - Some datanodes don't come up (eg due to HW issues) and check in on cluster startup; files with blocks where all replicas are on this set of datanodes are marked corrupt - Admin does fsck move, which deletes the corrupt files, saving whatever blocks were available - The HW issues with the datanodes are resolved; they are started and join the cluster. The NN tells them to delete their blocks for the corrupt files since the files were deleted. 
I think we should: - Make fsck move non-destructive by default (eg just do a move into lost+found) - Make the destructive behavior optional (eg --destructive so admins think about what they're doing) - Provide better sanity checks and warnings, eg if you're running fsck and not all the slaves have checked in (if using dfs.hosts) then fsck should print a warning indicating this, which an admin should have to override if they want to do something destructive -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3000) Add a public API for setting quotas
[ https://issues.apache.org/jira/browse/HDFS-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242924#comment-13242924 ] Hadoop QA commented on HDFS-3000:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520697/HDFS-3000.patch against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests: org.apache.hadoop.hdfs.TestLeaseRecovery2
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2133//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2133//console
This message is automatically generated.

Add a public API for setting quotas
Key: HDFS-3000 URL: https://issues.apache.org/jira/browse/HDFS-3000 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 2.0.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3000.patch, HDFS-3000.patch

Currently one can set the quota of a file or directory from the command line, but if a user wants to set it programmatically, they need to use DistributedFileSystem, which is annotated InterfaceAudience.Private.
[jira] [Commented] (HDFS-3000) Add a public API for setting quotas
[ https://issues.apache.org/jira/browse/HDFS-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242925#comment-13242925 ] Aaron T. Myers commented on HDFS-3000: I'm confident that the test failure is unrelated.
[jira] [Commented] (HDFS-3000) Add a public API for setting quotas
[ https://issues.apache.org/jira/browse/HDFS-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242928#comment-13242928 ] Tsz Wo (Nicholas), SZE commented on HDFS-3000:
- How about creating a DFSClient instead of using DistributedFileSystem?
- How about creating a new package org.apache.hadoop.hdfs.client for the API?
[jira] [Commented] (HDFS-3000) Add a public API for setting quotas
[ https://issues.apache.org/jira/browse/HDFS-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242937#comment-13242937 ] Aaron T. Myers commented on HDFS-3000:
bq. How about creating a DFSClient instead of using DistributedFileSystem?
What's the benefit of doing this?
bq. How about creating a new package org.apache.hadoop.hdfs.client for the API?
That seems like a good idea to me. I'll update the patch once you get back to me on the question above.
[jira] [Commented] (HDFS-3168) Clean up FSNamesystem and BlockManager
[ https://issues.apache.org/jira/browse/HDFS-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242935#comment-13242935 ] Hadoop QA commented on HDFS-3168:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520701/h3168_20120330.patch against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests: org.apache.hadoop.hdfs.TestFileAppend4
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2134//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2134//console
This message is automatically generated.

Clean up FSNamesystem and BlockManager
Key: HDFS-3168 URL: https://issues.apache.org/jira/browse/HDFS-3168 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h3168_20120330.patch
[jira] [Created] (HDFS-3170) Add more useful metrics for write latency
Add more useful metrics for write latency
Key: HDFS-3170 URL: https://issues.apache.org/jira/browse/HDFS-3170 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 2.0.0 Reporter: Todd Lipcon

Currently, the only write-latency related metric we expose is the total amount of time taken by opWriteBlock. This is practically useless, since (a) different blocks may be wildly different sizes, and (b) if the writer is only generating data slowly, it will make a block write take longer through no fault of the DN. I would like to propose two new metrics:

1) *flush-to-disk time*: count how long it takes for each call to flush an incoming packet to disk (including the checksums). In most cases this will be close to 0, as it only flushes to buffer cache, but if the backing block device enters congested writeback, it can take much longer, which provides an interesting metric.

2) *round trip to downstream pipeline node*: track the round-trip latency for the part of the pipeline between the local node and its downstream neighbors. When we add a new packet to the ack queue, save the current timestamp. When we receive an ack, update the metric based on how long since we sent the original packet. This gives a metric of the total RTT through the pipeline. If we also include this metric in the ack to upstream, we can subtract the amount of time due to the later stages in the pipeline and have an accurate count of this particular link.
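Proposal (2) above can be sketched in a few lines: remember the enqueue time of each packet, and on ack subtract both the send time and the latency the ack attributes to downstream stages. This is a standalone illustration with invented names (PipelineRttSketch etc.), not the actual DataNode code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch of the proposed RTT metric: record a timestamp when
// a packet is added to the ack queue, compute this link's latency when the
// ack arrives. All names are invented for this example.
public class PipelineRttSketch {
    // Enqueue times for packets awaiting acks, in send order
    // (FIFO works because acks come back in order).
    private final Deque<Long> sendTimesNanos = new ArrayDeque<>();
    private long totalRttNanos = 0;
    private long ackCount = 0;

    // Called when a packet is appended to the ack queue.
    public void packetSent(long nowNanos) {
        sendTimesNanos.addLast(nowNanos);
    }

    // Called when the corresponding ack arrives; downstreamNanos is the
    // time the ack attributes to later pipeline stages, so subtracting it
    // isolates the latency of this particular link.
    public void ackReceived(long nowNanos, long downstreamNanos) {
        long sent = sendTimesNanos.removeFirst();
        totalRttNanos += (nowNanos - sent) - downstreamNanos;
        ackCount++;
    }

    public long averageRttNanos() {
        return ackCount == 0 ? 0 : totalRttNanos / ackCount;
    }

    public static void main(String[] args) {
        PipelineRttSketch m = new PipelineRttSketch();
        m.packetSent(0);
        m.ackReceived(10_000, 4_000);  // 10us total, 4us downstream -> 6us here
        m.packetSent(20_000);
        m.ackReceived(30_000, 2_000);  // 10us total, 2us downstream -> 8us here
        System.out.println(m.averageRttNanos()); // prints 7000
    }
}
```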
[jira] [Commented] (HDFS-3170) Add more useful metrics for write latency
[ https://issues.apache.org/jira/browse/HDFS-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242950#comment-13242950 ] Todd Lipcon commented on HDFS-3170: Another improvement would be to track the flush-to-disk time separately per data directory. This can help detect disks that are starting to go bad.
[jira] [Assigned] (HDFS-3070) hdfs balancer doesn't balance blocks between datanodes
[ https://issues.apache.org/jira/browse/HDFS-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers reassigned HDFS-3070: Assignee: Aaron T. Myers

hdfs balancer doesn't balance blocks between datanodes
Key: HDFS-3070 URL: https://issues.apache.org/jira/browse/HDFS-3070 Project: Hadoop HDFS Issue Type: Bug Components: balancer Affects Versions: 0.24.0 Reporter: Stephen Chu Assignee: Aaron T. Myers Attachments: unbalanced_nodes.png, unbalanced_nodes_inservice.png

I TeraGenerated data into DataNodes styx01 and styx02. Looking at the web UI, both have over 3% disk usage. Attached is a screenshot of the Live Nodes web UI. On styx01, I run the _hdfs balancer_ command with threshold 1% and don't see the blocks being balanced across all 4 datanodes (all blocks on styx01 and styx02 stay put). HA is currently enabled.

[schu@styx01 ~]$ hdfs haadmin -getServiceState nn1
active
[schu@styx01 ~]$ hdfs balancer -threshold 1
12/03/08 10:10:32 INFO balancer.Balancer: Using a threshold of 1.0
12/03/08 10:10:32 INFO balancer.Balancer: namenodes = []
12/03/08 10:10:32 INFO balancer.Balancer: p = Balancer.Parameters[BalancingPolicy.Node, threshold=1.0]
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
Balancing took 95.0 milliseconds
[schu@styx01 ~]$

I believe with a threshold of 1% the balancer should trigger blocks being moved across DataNodes, right? I am curious about the namenodes = [] in the above output.

[schu@styx01 ~]$ hadoop version
Hadoop 0.24.0-SNAPSHOT
Subversion git://styx01.sf.cloudera.com/home/schu/hadoop-common/hadoop-common-project/hadoop-common -r f6a577d697bbcd04ffbc568167c97b79479ff319
Compiled by schu on Thu Mar 8 15:32:50 PST 2012
From source with checksum ec971a6e7316f7fbf471b617905856b8

From http://hadoop.apache.org/hdfs/docs/r0.21.0/api/org/apache/hadoop/hdfs/server/balancer/Balancer.html: The threshold parameter is a fraction in the range of (0%, 100%) with a default value of 10%. The threshold sets a target for whether the cluster is balanced. A cluster is balanced if, for each datanode, the utilization of the node (ratio of used space at the node to total capacity of the node) differs from the utilization of the cluster (ratio of used space in the cluster to total capacity of the cluster) by no more than the threshold value. The smaller the threshold, the more balanced a cluster will become. It takes more time to run the balancer for small threshold values. Also, for a very small threshold the cluster may not be able to reach the balanced state when applications write and delete files concurrently.
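The balance criterion quoted from the Balancer javadoc can be expressed directly in code. The following is a self-contained sketch of that definition only (the class and method names are invented); it is not the actual Balancer implementation.

```java
// Sketch of the documented balance criterion: a cluster is balanced if
// every node's utilization (used/capacity) is within `thresholdPct`
// percentage points of the overall cluster utilization.
public class BalanceCheckSketch {
    // used[i] and capacity[i] are per-datanode byte counts.
    static boolean isBalanced(long[] used, long[] capacity, double thresholdPct) {
        long totalUsed = 0, totalCap = 0;
        for (int i = 0; i < used.length; i++) {
            totalUsed += used[i];
            totalCap += capacity[i];
        }
        double clusterUtil = 100.0 * totalUsed / totalCap;
        for (int i = 0; i < used.length; i++) {
            double nodeUtil = 100.0 * used[i] / capacity[i];
            if (Math.abs(nodeUtil - clusterUtil) > thresholdPct) {
                return false; // this node is over/under the target band
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long[] used = {30, 30, 0, 0};       // two loaded nodes, two empty
        long[] cap  = {100, 100, 100, 100};
        // Cluster utilization is 15%; each node differs from it by 15
        // points, so a 1% threshold reports unbalanced while 20% does not.
        System.out.println(isBalanced(used, cap, 1.0));  // false
        System.out.println(isBalanced(used, cap, 20.0)); // true
    }
}
```

With the reporter's 1% threshold and only two of four nodes holding data, a working balancer should indeed report the cluster unbalanced and move blocks, which is why the "Balancing took 95.0 milliseconds" no-op output points at a bug.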
[jira] [Updated] (HDFS-3070) hdfs balancer doesn't balance blocks between datanodes
[ https://issues.apache.org/jira/browse/HDFS-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3070: Attachment: HDFS-3070.patch

Sigh. Looks like this problem is the classic "hdfs-site.xml never gets loaded because HdfsConfiguration is never statically initialized in the JVM" issue. The tests don't catch this because MiniDFSCluster sets up the configuration explicitly, without hdfs-site.xml having to get loaded. Here's a patch which addresses the issue. I tested this manually and confirmed that without the fix the balancer won't run, but with the fix it runs just fine. Sample output:

{noformat}
12/03/30 19:06:08 INFO balancer.Balancer: namenodes = [hdfs://ha-nn-uri]
12/03/30 19:06:08 INFO balancer.Balancer: p = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0]
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
12/03/30 19:06:09 INFO net.NetworkTopology: Adding a new node: /default-rack/172.29.20.100:50010
12/03/30 19:06:09 INFO balancer.Balancer: 0 over-utilized: []
12/03/30 19:06:09 INFO balancer.Balancer: 0 underutilized: []
The cluster is balanced. Exiting...
Balancing took 1.255 seconds
{noformat}
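The failure mode Aaron describes, configuration resources that are only registered by a class's static initializer, can be modeled in a few lines. The class names below (Conf, SiteDefaults) are invented for illustration; this is not Hadoop's actual Configuration/HdfsConfiguration code.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal model of the bug class described above: Conf only sees
// "hdfs-site"-style values if SiteDefaults gets statically initialized
// first. A tool that never touches SiteDefaults silently runs with an
// empty configuration, just as the balancer saw "namenodes = []".
public class StaticInitSketch {
    static class Conf {
        static final Map<String, String> RESOURCES = new HashMap<>();
        String get(String key) { return RESOURCES.get(key); }
    }

    static class SiteDefaults {
        static { // runs only when the class is first loaded by the JVM
            Conf.RESOURCES.put("dfs.nameservices", "ha-nn-uri");
        }
        static void touch() { /* calling any static member forces init */ }
    }

    public static void main(String[] args) {
        Conf conf = new Conf();
        // Without touching SiteDefaults the key is simply absent.
        System.out.println(conf.get("dfs.nameservices")); // null
        SiteDefaults.touch();
        System.out.println(conf.get("dfs.nameservices")); // ha-nn-uri
    }
}
```

This also shows why MiniDFSCluster-based tests miss the bug: they populate the configuration object directly, so nothing ever depends on the static initializer having run.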
[jira] [Commented] (HDFS-3000) Add a public API for setting quotas
[ https://issues.apache.org/jira/browse/HDFS-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242953#comment-13242953 ] Tsz Wo (Nicholas), SZE commented on HDFS-3000:
bq. What's the benefit of doing this?
DistributedFileSystem.setQuota(..) and other admin methods should be removed. They should be moved to the admin API.
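The API shape under discussion, a small public admin facade (in the proposed org.apache.hadoop.hdfs.client package) that hides the InterfaceAudience.Private DistributedFileSystem, might look roughly like the following. Everything here is a hypothetical standalone sketch with invented names, with an in-memory store standing in for the real filesystem; it is not the committed patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a public quota-admin API: callers program
// against the small interface, while the implementation (here a toy
// in-memory store) would in reality delegate to the private
// DistributedFileSystem.setQuota(..) machinery.
public class QuotaApiSketch {
    interface HdfsAdminLike {
        void setQuota(String path, long namespaceQuota);
        long getQuota(String path); // -1 means no quota set
    }

    static class InMemoryQuotaStore implements HdfsAdminLike {
        private final Map<String, Long> quotas = new HashMap<>();
        public void setQuota(String path, long namespaceQuota) {
            quotas.put(path, namespaceQuota);
        }
        public long getQuota(String path) {
            return quotas.getOrDefault(path, -1L);
        }
    }

    public static void main(String[] args) {
        HdfsAdminLike admin = new InMemoryQuotaStore();
        admin.setQuota("/user/alice", 10_000);
        System.out.println(admin.getQuota("/user/alice")); // 10000
    }
}
```

The design point from the thread: once such a facade is public, the equivalent methods on DistributedFileSystem can be removed without breaking user code.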
[jira] [Updated] (HDFS-3070) hdfs balancer doesn't balance blocks between datanodes
[ https://issues.apache.org/jira/browse/HDFS-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3070: Target Version/s: 2.0.0 (was: 0.23.3) Affects Version/s: 2.0.0 (was: 0.24.0) Status: Patch Available (was: Open)
[jira] [Commented] (HDFS-3094) add -nonInteractive and -force option to namenode -format command
[ https://issues.apache.org/jira/browse/HDFS-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242958#comment-13242958 ] Tsz Wo (Nicholas), SZE commented on HDFS-3094: +1 patch looks good. Todd, I will commit this if we don't hear back from you.

add -nonInteractive and -force option to namenode -format command
Key: HDFS-3094 URL: https://issues.apache.org/jira/browse/HDFS-3094 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.24.0, 1.0.2 Reporter: Arpit Gupta Assignee: Arpit Gupta Attachments: HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.patch, HDFS-3094.patch, HDFS-3094.patch, HDFS-3094.patch

Currently bin/hadoop namenode -format prompts the user for a Y/N to set up the directories in the local file system.
-force: the namenode formats the directories without prompting.
-nonInteractive: the namenode format will return with an exit code of 1 if the dir exists.
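The three formatting modes described above reduce to a small decision function. This is a standalone model with invented names (FormatDecisionSketch, Action), not the NameNode's actual format code.

```java
// Sketch of the -format option semantics from HDFS-3094: -force formats
// without prompting, -nonInteractive aborts (exit code 1) if the storage
// directories already exist, and the default behavior prompts for Y/N.
public class FormatDecisionSketch {
    enum Action { FORMAT, PROMPT, ABORT }

    static Action decide(boolean dirsExist, boolean force, boolean nonInteractive) {
        if (!dirsExist || force) {
            return Action.FORMAT;            // nothing to overwrite, or -force given
        }
        return nonInteractive ? Action.ABORT // -nonInteractive: exit 1, no prompt
                              : Action.PROMPT; // default: classic Y/N question
    }

    public static void main(String[] args) {
        System.out.println(decide(true, true, false));  // FORMAT
        System.out.println(decide(true, false, true));  // ABORT
        System.out.println(decide(true, false, false)); // PROMPT
        System.out.println(decide(false, false, true)); // FORMAT (dirs absent)
    }
}
```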
[jira] [Commented] (HDFS-3070) hdfs balancer doesn't balance blocks between datanodes
[ https://issues.apache.org/jira/browse/HDFS-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242987#comment-13242987 ] Hadoop QA commented on HDFS-3070:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520710/HDFS-3070.patch against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2136//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2136//console
This message is automatically generated.
[jira] [Commented] (HDFS-3168) Clean up FSNamesystem and BlockManager
[ https://issues.apache.org/jira/browse/HDFS-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242992#comment-13242992 ] Hadoop QA commented on HDFS-3168:

+1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520701/h3168_20120330.patch against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2137//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2137//console
This message is automatically generated.
[jira] [Updated] (HDFS-2991) failure to load edits: ClassCastException
[ https://issues.apache.org/jira/browse/HDFS-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated HDFS-2991: Attachment: hdfs-2991-0.22.txt

Here is a patch for the 0.22 branch. The test is a bit simpler than Todd's, but it fails without the patch and succeeds with it.

failure to load edits: ClassCastException
Key: HDFS-2991 URL: https://issues.apache.org/jira/browse/HDFS-2991 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.24.0, 0.23.1 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker Fix For: 0.23.2 Attachments: hdfs-2991-0.22.txt, hdfs-2991.txt, hdfs-2991.txt, image-with-buggy-append.tgz

In doing scale testing of trunk at r1291606, I hit the following:

java.io.IOException: Error replaying edit log at offset 1354251 Recent opcode offsets: 1350014 1350176 1350312 1354251
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:418)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:93)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:79)
...
Caused by: java.lang.ClassCastException: org.apache.hadoop.hdfs.server.namenode.INodeFile cannot be cast to org.apache.hadoop.hdfs.server.namenode.INodeFileUnderConstruction
at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:213)
... 13 more
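The ClassCastException in the trace comes from an unchecked downcast during edit-log replay: the loader assumed an inode was still under construction and cast blindly. The defensive pattern, check the runtime type first and fail with a descriptive error, can be sketched generically; the classes below are empty stand-ins, not the real HDFS inode types.

```java
// Generic sketch of the failure mode above: checking the runtime type
// before the downcast turns an opaque ClassCastException into an error
// message that names the offending inode's actual class.
public class SafeCastSketch {
    static class INodeFile { }
    static class INodeFileUnderConstruction extends INodeFile { }

    static INodeFileUnderConstruction asUnderConstruction(INodeFile inode) {
        if (!(inode instanceof INodeFileUnderConstruction)) {
            throw new IllegalStateException("Expected a file under construction but found "
                + inode.getClass().getSimpleName());
        }
        return (INodeFileUnderConstruction) inode;
    }

    public static void main(String[] args) {
        // A file still being written casts cleanly.
        System.out.println(asUnderConstruction(new INodeFileUnderConstruction())
            .getClass().getSimpleName());
        // A finalized file now produces a descriptive error instead of a CCE.
        try {
            asUnderConstruction(new INodeFile());
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```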