[jira] [Created] (HDFS-2141) Remove NameNode roles Active and Standby
Remove NameNode roles Active and Standby Key: HDFS-2141 URL: https://issues.apache.org/jira/browse/HDFS-2141 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Reporter: Suresh Srinivas Assignee: Suresh Srinivas In HDFS, the following roles are supported in NameNodeRole: ACTIVE, BACKUP, CHECKPOINT and STANDBY. Active and Standby are run-time states of the NameNode, while Backup and Checkpoint are the names/roles of the daemons that are started. This mixes up the run-time state of the NameNode with the daemon role. I propose changing NameNodeRole to: NAMENODE, BACKUP, CHECKPOINT. HDFS-1974 will introduce the states active and standby for the daemon that is running in the role NAMENODE. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
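The proposed enum change can be sketched as follows. This is an illustrative sketch, not the actual Hadoop source: only the constant names come from the issue description, and the wrapper class name is invented for the example.

```java
// Hypothetical sketch of the NameNodeRole change proposed in HDFS-2141.
public class NameNodeRoleSketch {

    // Before: run-time states (ACTIVE, STANDBY) mixed in with daemon roles.
    public enum OldNameNodeRole { ACTIVE, BACKUP, CHECKPOINT, STANDBY }

    // After: only daemon roles remain; per HDFS-1974, active/standby become
    // run-time states of a daemon running in the NAMENODE role.
    public enum NameNodeRole { NAMENODE, BACKUP, CHECKPOINT }
}
```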
[jira] [Updated] (HDFS-2141) Remove NameNode roles Active and Standby
[ https://issues.apache.org/jira/browse/HDFS-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-2141: -- Attachment: HDFS-2141.patch Changes: # Changed NameNodeRole#ACTIVE to NameNodeRole#NAMENODE. # Removed NameNodeRole#STANDBY.
[jira] [Commented] (HDFS-2134) Move DecommissionManager to block management
[ https://issues.apache.org/jira/browse/HDFS-2134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13063861#comment-13063861 ] Hudson commented on HDFS-2134: -- Integrated in Hadoop-Hdfs-trunk #722 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/722/]) HDFS-2134. Move DecommissionManager to the blockmanagement package. szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1145393 Files : * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hdfs/CHANGES.txt * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/DecommissionManager.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/DecommissionManager.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java Move DecommissionManager to block management Key: HDFS-2134 URL: https://issues.apache.org/jira/browse/HDFS-2134 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Affects Versions: 0.23.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: h2134_20110706.patch, h2134_20110708.patch, h2134_20110711.patch Datanode management including {{DecommissionManager}} should belong to block management.
[jira] [Commented] (HDFS-2132) Potential resource leak in EditLogFileOutputStream.close
[ https://issues.apache.org/jira/browse/HDFS-2132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13063862#comment-13063862 ] Hudson commented on HDFS-2132: -- Integrated in Hadoop-Hdfs-trunk #722 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/722/]) HDFS-2132. Potential resource leak in EditLogFileOutputStream.close. (atm) atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1145428 Files : * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileOutputStream.java * /hadoop/common/trunk/hdfs/CHANGES.txt * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLogFileOutputStream.java Potential resource leak in EditLogFileOutputStream.close Key: HDFS-2132 URL: https://issues.apache.org/jira/browse/HDFS-2132 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 0.23.0 Attachments: hdfs-2132.0.patch, hdfs-2132.1.patch, hdfs-2132.2.patch, hdfs-2132.3.patch {{EditLogFileOutputStream.close(...)}} sequentially closes a series of underlying resources. If any of the calls to {{close()}} throw before the last one, the later resources will never be closed.
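The leak pattern and one common remedy can be sketched as follows. This is a generic illustration with invented names, not the actual EditLogFileOutputStream fix: attempt every close() even when an earlier one throws, and rethrow the first failure afterwards.

```java
import java.io.Closeable;
import java.io.IOException;

// Generic illustration of the HDFS-2132 problem: close() calls made one
// after another leak the later resources when an earlier close() throws.
public class SafeClose {

    // Close every resource; remember the first failure and rethrow it
    // only after all closes have been attempted.
    static void closeAll(Closeable... resources) throws IOException {
        IOException first = null;
        for (Closeable c : resources) {
            try {
                if (c != null) c.close();
            } catch (IOException e) {
                if (first == null) first = e;  // keep closing the rest
            }
        }
        if (first != null) throw first;
    }
}
```

With the naive sequential version, an exception from the first close() propagates immediately and the remaining streams are never closed, which is the leak the issue describes.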
[jira] [Commented] (HDFS-395) DFS Scalability: Incremental block reports
[ https://issues.apache.org/jira/browse/HDFS-395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13063897#comment-13063897 ] Todd Lipcon commented on HDFS-395: -- Why do we need to defer the notification? Since they're immediately moved out of the current directory into the toBeDeleted dir, they are as good as deleted even if we crash and restart. DFS Scalability: Incremental block reports -- Key: HDFS-395 URL: https://issues.apache.org/jira/browse/HDFS-395 Project: Hadoop HDFS Issue Type: Sub-task Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: blockReportPeriod.patch, explicitDeleteAcks.patch I have a cluster that has 1800 datanodes. Each datanode has around 5 blocks and sends a block report to the namenode once every hour. This means that the namenode processes a block report once every 2 seconds. Each block report contains all blocks that the datanode currently hosts. This makes the namenode compare a huge number of blocks that practically remains the same between two consecutive reports. This wastes CPU on the namenode. The problem becomes worse when the number of datanodes increases. One proposal is to make succeeding block reports (after a successful send of a full block report) be incremental. This will make the namenode process only those blocks that were added/deleted in the last period.
[jira] [Commented] (HDFS-395) DFS Scalability: Incremental block reports
[ https://issues.apache.org/jira/browse/HDFS-395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13063979#comment-13063979 ] Tomasz Nykiel commented on HDFS-395: Todd, please correct me if I am wrong, but I don't see that this is happening in FSDataset.invalidate(). The previous patch was renaming the block and metafile files, and notifying the NN immediately (please see the discussion above). If the files were moved to toBeDeleted or were renamed, then, of course, we do not need to wait.
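The incremental-report idea from the issue description can be sketched as follows. This is a hypothetical illustration (class and method names invented, not Hadoop's actual datanode code): after a successful full report, the datanode accumulates only the block additions and deletions since the last report and drains them into the next one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of HDFS-395: report only the delta (added/deleted
// blocks) between consecutive block reports instead of every hosted block.
public class IncrementalReportSketch {
    private final List<Long> added = new ArrayList<>();
    private final List<Long> deleted = new ArrayList<>();

    public synchronized void blockAdded(long blockId)   { added.add(blockId); }
    public synchronized void blockDeleted(long blockId) { deleted.add(blockId); }

    // Drain the pending deltas into one incremental report:
    // report[0] = added block ids, report[1] = deleted block ids.
    public synchronized long[][] buildIncrementalReport() {
        long[][] report = { toArray(added), toArray(deleted) };
        added.clear();
        deleted.clear();
        return report;
    }

    private static long[] toArray(List<Long> ids) {
        long[] a = new long[ids.size()];
        for (int i = 0; i < a.length; i++) a[i] = ids.get(i);
        return a;
    }
}
```

Under this scheme the namenode compares only the delta, which avoids rescanning the (mostly unchanged) full block list on every report.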
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13063992#comment-13063992 ] Todd Lipcon commented on HDFS-1073: --- Hairong asked me to comment describing what testing I've done on the branch. Here's a summary:
- Lots of new unit tests -- about 3600 lines of net new test code, ~1000 lines updated. Total of 56 new test cases by my grepping.
- Stress testing of 2NNs:
-- Case 1: Start one NN with two data dirs. Start two 2NNs configured with checkpoint period of 0 (checkpoint as fast as possible). Let it loop for several hours to make sure nothing crashes.
-- Case 2: Start one NN with two data dirs, one of which is on a filesystem mounted on top of software RAID configured in faulty mode. Set the faulty RAID driver to throw an IO error every 10,000 reads. Start 2NN with checkpoint period 0, run for several minutes, making sure the injected IO errors are handled correctly. Eventually the ext3 filesystem ends up remounting itself as read-only. fsck and remount the filesystem while the NN is running, make sure it can be restored correctly.
-- Both of the above tests are run while a separate program with 10 threads pounds mkdirs and delete calls into the NN as fast as it can.
- Stress testing of BN:
-- Start NN. Start load generator (spamming mkdirs and delete calls).
-- Start BN with checkpoint configured once a minute.
-- Periodically stop load generator, issue mkdirs on NN and BN and make sure results are identical.
-- Take md5sum of files in BN's name dir and NN's name dir; verify that MD5s match.
-- Resume load generation.
The above testing yielded a couple of bugs which I then converted to functional tests to prevent regressions.
Simpler model for Namenode's fs Image and edit Logs Key: HDFS-1073 URL: https://issues.apache.org/jira/browse/HDFS-1073 Project: Hadoop HDFS Issue Type: Improvement Reporter: Sanjay Radia Assignee: Todd Lipcon Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, hdfs1073.pdf, hdfs1073.tex The naming and handling of NN's fsImage and edit logs can be significantly improved, resulting in simpler and more robust code.
[jira] [Assigned] (HDFS-1580) Add interface for generic Write Ahead Logging mechanisms
[ https://issues.apache.org/jira/browse/HDFS-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey reassigned HDFS-1580: -- Assignee: Jitendra Nath Pandey Add interface for generic Write Ahead Logging mechanisms Key: HDFS-1580 URL: https://issues.apache.org/jira/browse/HDFS-1580 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ivan Kelly Assignee: Jitendra Nath Pandey Fix For: Edit log branch (HDFS-1073) Attachments: EditlogInterface.1.pdf, EditlogInterface.2.pdf, EditlogInterface.3.pdf, HDFS-1580+1521.diff, HDFS-1580.diff, HDFS-1580.diff, HDFS-1580.diff, generic_wal_iface.pdf, generic_wal_iface.pdf, generic_wal_iface.pdf, generic_wal_iface.txt
[jira] [Commented] (HDFS-2054) BlockSender.sendChunk() prints ERROR for connection closures encountered during transferToFully()
[ https://issues.apache.org/jira/browse/HDFS-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064041#comment-13064041 ] stack commented on HDFS-2054: - +1 on patch. Nice comment. Small change. Will commit in next couple of hours unless objection. BlockSender.sendChunk() prints ERROR for connection closures encountered during transferToFully() -- Key: HDFS-2054 URL: https://issues.apache.org/jira/browse/HDFS-2054 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 0.22.0, 0.23.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Minor Fix For: 0.22.0, 0.23.0 Attachments: HDFS-2054-1.patch, HDFS-2054-2.patch, HDFS-2054.patch, HDFS-2054.patch The addition of ERROR was part of HDFS-1527. In environments where clients tear down FSInputStream/connection before reaching the end of stream, this error message often pops up. Since these are not really errors and especially not the fault of data node, the message should be toned down at least.
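The direction of the fix, toning down log output for expected client-side teardown, can be sketched as follows. This is a hypothetical illustration (invented class and method names, java.util.logging standing in for Hadoop's logging), not the actual BlockSender patch.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch of the HDFS-2054 idea: a client closing its
// connection early is not a datanode fault, so log it below ERROR.
public class ChunkSendLogging {
    private static final Logger LOG = Logger.getLogger("BlockSenderSketch");

    // Pick a log level based on whether the failure looks like an
    // expected client-side teardown (message heuristics are assumptions).
    static Level levelFor(IOException e) {
        String msg = String.valueOf(e.getMessage());
        return (msg.contains("Broken pipe") || msg.contains("Connection reset"))
                ? Level.INFO : Level.SEVERE;
    }

    static void reportSendFailure(IOException e) {
        LOG.log(levelFor(e), "failure sending chunk", e);
    }
}
```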
[jira] [Commented] (HDFS-2054) BlockSender.sendChunk() prints ERROR for connection closures encountered during transferToFully()
[ https://issues.apache.org/jira/browse/HDFS-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064054#comment-13064054 ] Jakob Homan commented on HDFS-2054: --- Since we're trying to remove the calls to StringifyException (HDFS-1977), can we do so with this patch as well?
[jira] [Commented] (HDFS-2054) BlockSender.sendChunk() prints ERROR for connection closures encountered during transferToFully()
[ https://issues.apache.org/jira/browse/HDFS-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064064#comment-13064064 ] Kihwal Lee commented on HDFS-2054: -- I incorporated Jakob's comment and a new patch has been posted.
[jira] [Updated] (HDFS-2054) BlockSender.sendChunk() prints ERROR for connection closures encountered during transferToFully()
[ https://issues.apache.org/jira/browse/HDFS-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-2054: - Attachment: HDFS-2054.patch
[jira] [Commented] (HDFS-1977) Stop using StringUtils.stringifyException()
[ https://issues.apache.org/jira/browse/HDFS-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064067#comment-13064067 ] Bharath Mundlapudi commented on HDFS-1977: -- Thank you all, I am attaching a patch which addresses Jitendra's comment. Stop using StringUtils.stringifyException() --- Key: HDFS-1977 URL: https://issues.apache.org/jira/browse/HDFS-1977 Project: Hadoop HDFS Issue Type: Improvement Reporter: Joey Echeverria Assignee: Bharath Mundlapudi Priority: Minor Attachments: HDFS-1977-1.patch, HDFS-1977-2.patch, HDFS-1977-3.patch The old version of the logging APIs didn't support logging stack traces by passing exceptions to the logging methods (e.g. Log.error()). A number of log statements make use of StringUtils.stringifyException() to get around the old behavior. It would be nice if this could get cleaned up to make use of the logger's stack trace printing. This also gives users more control since you can configure how the stack traces are written to the logs.
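The cleanup the issue describes can be sketched as follows. This is a generic illustration: Hadoop itself uses Apache Commons Logging, but java.util.logging is used here so the sketch is self-contained, and the class and method names are invented.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch of the HDFS-1977 cleanup: pass the Throwable to the logger
// instead of flattening the stack trace into the message string.
public class LoggingStyle {
    private static final Logger LOG = Logger.getLogger("LoggingStyleSketch");

    public void handle(Exception e) {
        // Old pattern (pre-cleanup), with the trace baked into the message:
        //   LOG.severe("operation failed: " + StringUtils.stringifyException(e));

        // Preferred: hand the Throwable to the logger, so the logging
        // configuration decides how (and whether) to render the trace.
        LOG.log(Level.SEVERE, "operation failed", e);
    }
}
```

Passing the Throwable is what gives users the control mentioned in the issue: the rendering of stack traces becomes a property of the logging configuration rather than of each call site.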
[jira] [Updated] (HDFS-1977) Stop using StringUtils.stringifyException()
[ https://issues.apache.org/jira/browse/HDFS-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharath Mundlapudi updated HDFS-1977: - Attachment: HDFS-1977-4.patch
[jira] [Updated] (HDFS-1580) Add interface for generic Write Ahead Logging mechanisms
[ https://issues.apache.org/jira/browse/HDFS-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDFS-1580: --- Fix Version/s: (was: Edit log branch (HDFS-1073)) 0.23.0
[jira] [Commented] (HDFS-2114) re-commission of a decommissioned node does not delete excess replica
[ https://issues.apache.org/jira/browse/HDFS-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064102#comment-13064102 ] John George commented on HDFS-2114: --- Can someone please review this patch? re-commission of a decommissioned node does not delete excess replica - Key: HDFS-2114 URL: https://issues.apache.org/jira/browse/HDFS-2114 Project: Hadoop HDFS Issue Type: Bug Reporter: John George Assignee: John George Attachments: HDFS-2114-2.patch, HDFS-2114-3.patch, HDFS-2114.patch If a decommissioned node is removed from the decommissioned list, namenode does not delete the excess replicas it created while the node was decommissioned.
[jira] [Commented] (HDFS-2054) BlockSender.sendChunk() prints ERROR for connection closures encountered during transferToFully()
[ https://issues.apache.org/jira/browse/HDFS-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064104#comment-13064104 ] Hadoop QA commented on HDFS-2054: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12486211/HDFS-2054.patch against trunk revision 1145428.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/909//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/909//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/909//console
This message is automatically generated.
[jira] [Commented] (HDFS-1977) Stop using StringUtils.stringifyException()
[ https://issues.apache.org/jira/browse/HDFS-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064105#comment-13064105 ] Hadoop QA commented on HDFS-1977: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12486212/HDFS-1977-4.patch against trunk revision 1145428.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/910//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/910//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/910//console
This message is automatically generated.
[jira] [Commented] (HDFS-2054) BlockSender.sendChunk() prints ERROR for connection closures encountered during transferToFully()
[ https://issues.apache.org/jira/browse/HDFS-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064112#comment-13064112 ] Kihwal Lee commented on HDFS-2054: -- bq. -1 tests included. The patch doesn't appear to include any new or modified tests. The previous justification also applies to the new patch.
[jira] [Commented] (HDFS-1977) Stop using StringUtils.stringifyException()
[ https://issues.apache.org/jira/browse/HDFS-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064117#comment-13064117 ] Jitendra Nath Pandey commented on HDFS-1977: +1 for the patch.
[jira] [Commented] (HDFS-2054) BlockSender.sendChunk() prints ERROR for connection closures encountered during transferToFully()
[ https://issues.apache.org/jira/browse/HDFS-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064116#comment-13064116 ] Jakob Homan commented on HDFS-2054: --- +1
[jira] [Created] (HDFS-2142) Namenode in trunk has much slower performance than Namenode in MR-279 branch
Namenode in trunk has much slower performance than Namenode in MR-279 branch Key: HDFS-2142 URL: https://issues.apache.org/jira/browse/HDFS-2142 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Eric Payne I am measuring the performance of the namenode by running the org.apache.hadoop.fs.loadGenerator.LoadGenerator application. This application shows there is a very large slowdown in the processing of opens, writes, closes, and operations per second in trunk when compared to the MR-279 branch. There have been some race conditions and locking issues fixed in trunk, which is a very good thing because these race conditions were causing the namenode to crash under load conditions (see HDFS-1257). However, the slowdown to the namenode is considerable. I am still trying to verify which changes caused the slowdown. It was originally suggested that HDFS-988 may have caused the slowdown, but I don't think it was the culprit. I have checked out and built from SVN 3 revisions previous to HDFS-988 and they all have about the same performance. Here is my environment:
Host0: namenode daemon
Host1-9: simulate many datanodes using org.apache.hadoop.hdfs.DataNodeCluster
LoadGenerator output on MR-279 branch:
Average open execution time: 1.8496516782773909ms
Average deletion execution time: 2.956340167046317ms
Average create execution time: 3.725259427992913ms
Average write_close execution time: 11.151860288534548ms
Average operations per second: 1053.3ops/s
LoadGenerator output on trunk:
Average open execution time: 28.603515625ms
Average deletion execution time: 32.20792079207921ms
Average create execution time: 32.37326732673267ms
Average write_close execution time: 82.84752475247525ms
Average operations per second: 135.13ops/s
[jira] [Updated] (HDFS-2054) BlockSender.sendChunk() prints ERROR for connection closures encountered during transferToFully()
[ https://issues.apache.org/jira/browse/HDFS-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HDFS-2054: Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed to 0.22 branch and to trunk. Thanks for the patch Kihwal (Nice reviewing lads, Todd and Jakob)
[jira] [Commented] (HDFS-2054) BlockSender.sendChunk() prints ERROR for connection closures encountered during transferToFully()
[ https://issues.apache.org/jira/browse/HDFS-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064129#comment-13064129 ] Hudson commented on HDFS-2054: -- Integrated in Hadoop-Hdfs-trunk-Commit #782 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/782/]) HDFS-2054 BlockSender.sendChunk() prints ERROR for connection closures encountered during transferToFully() -- moved the change notice into 0.22 section (i'd originally committed it in trunk section) HDFS-2054 BlockSender.sendChunk() prints ERROR for connection closures encountered during transferToFully() stack : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1145753 Files : * /hadoop/common/trunk/hdfs/CHANGES.txt stack : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1145751 Files : * /hadoop/common/trunk/hdfs/CHANGES.txt * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java
[jira] [Commented] (HDFS-347) DFS read performance suboptimal when client co-located on nodes with data
[ https://issues.apache.org/jira/browse/HDFS-347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064130#comment-13064130 ] Nathan Roberts commented on HDFS-347: - With the work on HDFS-2080, I'd be really curious to see a benchmark with/without HDFS-347. With some of the other bottlenecks (HDFS-941 and HDFS-2080) out of the way, we'd be close to an apples-to-apples comparison. DFS read performance suboptimal when client co-located on nodes with data - Key: HDFS-347 URL: https://issues.apache.org/jira/browse/HDFS-347 Project: Hadoop HDFS Issue Type: Improvement Reporter: George Porter Assignee: Todd Lipcon Attachments: BlockReaderLocal1.txt, HADOOP-4801.1.patch, HADOOP-4801.2.patch, HADOOP-4801.3.patch, HDFS-347-branch-20-append.txt, all.tsv, hdfs-347.png, hdfs-347.txt, local-reads-doc One of the major strategies Hadoop uses to get scalable data processing is to move the code to the data. However, putting the DFS client on the same physical node as the data blocks it acts on doesn't improve read performance as much as expected. After looking at Hadoop and O/S traces (via HADOOP-4049), I think the problem is due to the HDFS streaming protocol causing many more read I/O operations (iops) than necessary. Consider the case of a DFSClient fetching a 64 MB disk block from the DataNode process (running in a separate JVM) running on the same machine. The DataNode will satisfy the single disk block request by sending data back to the HDFS client in 64-KB chunks. In BlockSender.java, this is done in the sendChunk() method, relying on Java's transferTo() method. Depending on the host O/S and JVM implementation, transferTo() is implemented as either a sendfilev() syscall or a pair of mmap() and write(). In either case, each chunk is read from the disk by issuing a separate I/O operation for each chunk. The result is that the single request for a 64-MB block ends up hitting the disk as over a thousand smaller requests for 64-KB each.
Since the DFSClient runs in a different JVM and process than the DataNode, shuttling data from the disk to the DFSClient also results in context switches each time network packets get sent (in this case, the 64-KB chunk turns into a large number of 1500-byte packet send operations). Thus we see a large number of context switches for each block send operation. I'd like to get some feedback on the best way to address this, but I think we should provide a mechanism for a DFSClient to directly open data blocks that happen to be on the same machine. It could do this by examining the set of LocatedBlocks returned by the NameNode, marking those that should be resident on the local host. Since the DataNode and DFSClient (probably) share the same hadoop configuration, the DFSClient should be able to find the files holding the block data, and it could directly open them and send data back to the client. This would avoid the context switches imposed by the network layer, and would allow for much larger read buffers than 64KB, which should reduce the number of iops imposed by each read block operation.
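The iops argument above can be illustrated with a toy sketch: draining the same block with the streaming protocol's 64-KB chunks versus one large buffer that a co-located reader could use after opening the block file directly. All names are hypothetical; this is not the HDFS-347 patch:

```java
import java.io.ByteArrayInputStream;

// Toy illustration of why direct local reads help: fewer, larger read calls
// against the same amount of data mean fewer iops and context switches.
public class LocalBlockReadSketch {
    // Count how many read calls it takes to drain a block at a given buffer size.
    static int readCalls(ByteArrayInputStream blockFile, int bufferSize) {
        byte[] buf = new byte[bufferSize];
        int calls = 0;
        while (blockFile.read(buf, 0, buf.length) != -1) {
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        byte[] block = new byte[1 << 20];  // stand-in for a 1 MB slice of a block
        // 64-KB chunks (streaming protocol) vs one 1-MB buffer (direct local read)
        int chunked = readCalls(new ByteArrayInputStream(block), 64 * 1024);
        int direct = readCalls(new ByteArrayInputStream(block), 1 << 20);
        System.out.println(chunked + " read calls vs " + direct);
    }
}
```

Scaled to a real 64-MB block, the same arithmetic gives the "over a thousand" 64-KB requests the description mentions, versus a handful of large reads.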
[jira] [Created] (HDFS-2143) dfsclusterhealth: we should link to the live nodes and dead nodes
dfsclusterhealth: we should link to the live nodes and dead nodes - Key: HDFS-2143 URL: https://issues.apache.org/jira/browse/HDFS-2143 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Ravi Prakash Assignee: Ravi Prakash The dfsclusterhealth page shows the number of live and dead nodes. It would be nice to link those numbers to the page containing the list of those nodes.
[jira] [Updated] (HDFS-2143) dfsclusterhealth: we should link to the live nodes and dead nodes
[ https://issues.apache.org/jira/browse/HDFS-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated HDFS-2143: --- Attachment: HDFS-2143.1.patch No tests included because I could not find a framework for testing the web interface. Please review and commit!
[jira] [Updated] (HDFS-2143) dfsclusterhealth: we should link to the live nodes and dead nodes
[ https://issues.apache.org/jira/browse/HDFS-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated HDFS-2143: --- Status: Patch Available (was: Open)
[jira] [Commented] (HDFS-347) DFS read performance suboptimal when client co-located on nodes with data
[ https://issues.apache.org/jira/browse/HDFS-347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064158#comment-13064158 ] dhruba borthakur commented on HDFS-347: --- My observation has been that it is the high CPU usage on the datanodes that was causing performance degradation while doing random reads from HDFS (local block). I have 400 threads in hbase that are doing random reads from a bunch of files in HDFS.
[jira] [Commented] (HDFS-1872) BPOfferService.cleanUp(..) throws NullPointerException
[ https://issues.apache.org/jira/browse/HDFS-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064161#comment-13064161 ] Bharath Mundlapudi commented on HDFS-1872: -- Yes, I was seeing the NPE in cleanup code earlier. I made some changes in this area related to datanode exit. It should be fine now. BPOfferService.cleanUp(..) throws NullPointerException -- Key: HDFS-1872 URL: https://issues.apache.org/jira/browse/HDFS-1872 Project: Hadoop HDFS Issue Type: Bug Components: data-node Reporter: Tsz Wo (Nicholas), SZE {noformat} NullPointerException at org.apache.hadoop.hdfs.server.datanode.DataNode$BPOfferService.cleanUp(DataNode.java:1005) at org.apache.hadoop.hdfs.server.datanode.DataNode$BPOfferService.run(DataNode.java:1220) at java.lang.Thread.run(Thread.java:662) {noformat}
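A minimal sketch of the kind of defensive check that avoids this class of NPE when cleanup runs against partially initialized state. The class and field names are invented, not the actual BPOfferService code:

```java
// Illustration only: cleanUp() can be invoked before the service finished
// initializing its fields (e.g. when datanode startup fails early), so each
// field must be guarded rather than assumed non-null.
public class CleanUpGuard {
    private Object bpRegistration;  // may still be null if startup failed early

    // Returns true if there was state to release, false if it was a no-op.
    boolean cleanUp() {
        if (bpRegistration == null) {
            return false;  // nothing was initialized yet; avoid the NPE
        }
        bpRegistration = null;  // deregister / release resources here
        return true;
    }

    void register(Object registration) {
        bpRegistration = registration;
    }
}
```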
[jira] [Commented] (HDFS-2143) dfsclusterhealth: we should link to the live nodes and dead nodes
[ https://issues.apache.org/jira/browse/HDFS-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064166#comment-13064166 ] Ravi Prakash commented on HDFS-2143: This issue was filed by Arpit Gupta. {quote} Currently on the dfsclusterhealth.jsp we show how many nodes are live and dead. It would be nice to have a link to the page which would show the list of the nodes. Rather than the user having to click on the namenode and then select the live or dead nodes link, if the link is present on the dfsclusterhealth page then it saves the user an extra click. {quote}
[jira] [Created] (HDFS-2144) SNN shuts down because of auth issues but does not log it
SNN shuts down because of auth issues but does not log it - Key: HDFS-2144 URL: https://issues.apache.org/jira/browse/HDFS-2144 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ravi Prakash Assignee: Ravi Prakash SNN should log messages when it shuts down because of authentication issues.
[jira] [Commented] (HDFS-2144) SNN shuts down because of auth issues but does not log it
[ https://issues.apache.org/jira/browse/HDFS-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064174#comment-13064174 ] Ravi Prakash commented on HDFS-2144: This was again discovered by Arpit Gupta! :) {quote} Here is the info from the log 2011-05-20 22:08:41,806 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: STARTUP_MSG: / STARTUP_MSG: Starting SecondaryNameNode STARTUP_MSG: host = localhost/127.0.0.1 STARTUP_MSG: args = [] STARTUP_MSG: version = 0.22.0.1105190202 STARTUP_MSG: classpath = .. .. STARTUP_MSG: build = git://git.apache.org -r e46832994ee3402918135d1a5bd0b21ed58f5ed0; compiled by 'myusername' on Thu May 19 02:11:11 PDT 2011 / 2011-05-20 22:08:42,119 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: SHUTDOWN_MSG: / Now this happened because the snn was setup to be behind a virtual host which fails as the 203.3 patch is yet to be merged. The log should have some message that states the error {quote}
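The improvement requested here amounts to logging the fatal exception before the SHUTDOWN_MSG instead of exiting silently. A rough sketch of the pattern, with invented names (this is not the actual SecondaryNameNode code):

```java
// Illustration: wrap startup in a catch that records the cause of the
// shutdown, so the log shows *why* the daemon exited rather than just
// STARTUP_MSG followed immediately by SHUTDOWN_MSG.
public class StartupLogging {
    // Hypothetical formatter for the fatal log line.
    static String describeFailure(Exception e) {
        return "SecondaryNameNode startup failed: " + e;
    }

    public static void main(String[] args) {
        try {
            // stand-in for initialization that can fail on auth/virtual-host setup
            throw new SecurityException("Login failure for configured host");
        } catch (Exception e) {
            // log the cause first, then proceed to shut down
            System.err.println(describeFailure(e));
        }
    }
}
```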
[jira] [Commented] (HDFS-347) DFS read performance suboptimal when client co-located on nodes with data
[ https://issues.apache.org/jira/browse/HDFS-347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064199#comment-13064199 ] Nathan Roberts commented on HDFS-347: - Do you run with HDFS-941? How many random reads per second are you hitting hdfs with? IIRC we see close to 10K 64K reads per second with ~2 cores.
[jira] [Commented] (HDFS-2120) on reconnect, DN can connect to NN even with different source versions
[ https://issues.apache.org/jira/browse/HDFS-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064218#comment-13064218 ] Aaron T. Myers commented on HDFS-2120: -- Hey John, patch looks pretty good. A few small comments: # Make sure there's a space between if and the open parenthesis. # No need for the space after ! in if (! nsver.equals(stver)) # I think the variable names nsver and stver could be more descriptive. Or, at least, they should be camel cased (capitalize the v.) # Would it really be so difficult to write a test for this? Seems like it wouldn't be too tough to use Mockito to mock out bpNamenode.versionRequest() to return some object with a different version number from the actual. on reconnect, DN can connect to NN even with different source versions -- Key: HDFS-2120 URL: https://issues.apache.org/jira/browse/HDFS-2120 Project: Hadoop HDFS Issue Type: Bug Reporter: John George Assignee: John George Attachments: HDFS-2120.patch DN or NN does not check for source versions in cases when NN goes away or has lost connection. The only check that is done by NN is for LAYOUT_VERSION and DN does not check for any version mismatch.
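The first three review points could be applied like this; the method and variable names are hypothetical illustrations of the suggested style, not the actual patch:

```java
// Sketch applying the review feedback above: a space between "if" and the
// parenthesis, no space after "!", and descriptive camel-cased names in
// place of nsver/stver. All identifiers here are invented.
public class VersionCheck {
    static boolean versionsMatch(String namespaceVersion, String storageVersion) {
        // was: if (! nsver.equals(stver))
        if (!namespaceVersion.equals(storageVersion)) {
            return false;  // refuse reconnect on a version mismatch
        }
        return true;
    }
}
```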
[jira] [Commented] (HDFS-2143) dfsclusterhealth: we should link to the live nodes and dead nodes
[ https://issues.apache.org/jira/browse/HDFS-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064219#comment-13064219 ] Hadoop QA commented on HDFS-2143: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12486228/HDFS-2143.1.patch against trunk revision 1145753. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/911//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/911//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/911//console This message is automatically generated.
[jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs
[ https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064221#comment-13064221 ] Jitendra Nath Pandey commented on HDFS-1073: A few comments: 1. EditLogFileInputStream doesn't have any change except for an unused import. 2. EditLogOutputStream.java: abstract void write(byte[] data, int i, int length) All transactions should have a txid, therefore this write method is confusing. I guess it would be cleaned up with the backup node fix. Please change the parameter name 'i' to offset. 3. FSEditLog.java: What is the reason to persist start and end of log segments? Do we really need OP_START_LOG_SEGMENT and OP_END_LOG_SEGMENT? 4. FSEditLogOp.java - LogHeader has a read method but not a write. Would it make sense to encapsulate both read and write of the header in the same class? 5. NNStorage.java - writeTransactionIdFileToStorage: The transaction id will be persisted along with the image and log files. For a running namenode, it will be in the in-memory state. It is not clear to me why we need to persist a txid marker separately. 6. There are unused imports in a few files. 7. I have a few concerns related to FSImageTransactionalStorageInspector and FSEditLogLoader, but those parts have been addressed in HDFS-2018. I recommend committing HDFS-2018 in the branch as it significantly improves some parts of the code. Simpler model for Namenode's fs Image and edit Logs Key: HDFS-1073 URL: https://issues.apache.org/jira/browse/HDFS-1073 Project: Hadoop HDFS Issue Type: Improvement Reporter: Sanjay Radia Assignee: Todd Lipcon Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, hdfs1073.pdf, hdfs1073.tex The naming and handling of the NN's fsImage and edit logs can be significantly improved, resulting in simpler and more robust code.
[jira] [Commented] (HDFS-2038) Update test to handle relative paths with globs
[ https://issues.apache.org/jira/browse/HDFS-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064228#comment-13064228 ] Aaron T. Myers commented on HDFS-2038: -- For some reason the pre-commit build for this latest patch was aborted by Hudson. I just kicked off another build. Update test to handle relative paths with globs --- Key: HDFS-2038 URL: https://issues.apache.org/jira/browse/HDFS-2038 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-2038-2.patch, HDFS-2038.patch This provides the test updates for FsShell to retain relativity for paths with globs.
[jira] [Commented] (HDFS-2054) BlockSender.sendChunk() prints ERROR for connection closures encountered during transferToFully()
[ https://issues.apache.org/jira/browse/HDFS-2054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064235#comment-13064235 ] Hudson commented on HDFS-2054: -- Integrated in Hadoop-Hdfs-22-branch #71 (See [https://builds.apache.org/job/Hadoop-Hdfs-22-branch/71/]) HDFS-2054 BlockSender.sendChunk() prints ERROR for connection closures encountered during transferToFully() stack : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1145752 Files : * /hadoop/common/branches/branch-0.22/hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.22/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java
[jira] [Commented] (HDFS-2038) Update test to handle relative paths with globs
[ https://issues.apache.org/jira/browse/HDFS-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064248#comment-13064248 ] Hadoop QA commented on HDFS-2038: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12486100/HDFS-2038-2.patch against trunk revision 1145753. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 262 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/912//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/912//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/912//console This message is automatically generated.
[jira] [Commented] (HDFS-1872) BPOfferService.cleanUp(..) throws NullPointerException
[ https://issues.apache.org/jira/browse/HDFS-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064257#comment-13064257 ] Uma Maheswara Rao G commented on HDFS-1872: --- Bharath, can you please mention the JIRA ID which probably solved this problem?
[jira] [Updated] (HDFS-2131) Tests for HADOOP-7361
[ https://issues.apache.org/jira/browse/HDFS-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-2131: -- Affects Version/s: 0.23.0 0.20.1 Tests for HADOOP-7361 - Key: HDFS-2131 URL: https://issues.apache.org/jira/browse/HDFS-2131 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.20.1, 0.23.0 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Attachments: HADOOP-7361-test.patch, HADOOP-7361-test.patch, HADOOP-7361-test.patch
[jira] [Updated] (HDFS-2131) Tests for HADOOP-7361
[ https://issues.apache.org/jira/browse/HDFS-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-2131: - Status: Patch Available (was: Open)
[jira] [Updated] (HDFS-2131) Tests for HADOOP-7361
[ https://issues.apache.org/jira/browse/HDFS-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-2131: - Status: Open (was: Patch Available)
[jira] [Updated] (HDFS-1977) Stop using StringUtils.stringifyException()
[ https://issues.apache.org/jira/browse/HDFS-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDFS-1977: --- Resolution: Fixed Status: Resolved (was: Patch Available) I have committed this. Thanks to Bharath! Stop using StringUtils.stringifyException() --- Key: HDFS-1977 URL: https://issues.apache.org/jira/browse/HDFS-1977 Project: Hadoop HDFS Issue Type: Improvement Reporter: Joey Echeverria Assignee: Bharath Mundlapudi Priority: Minor Attachments: HDFS-1977-1.patch, HDFS-1977-2.patch, HDFS-1977-3.patch, HDFS-1977-4.patch The old version of the logging APIs didn't support logging stack traces by passing exceptions to the logging methods (e.g. Log.error()). A number of log statements make use of StringUtils.stringifyException() to get around the old behavior. It would be nice if this could get cleaned up to make use of the logger's stack trace printing. This also gives users more control since you can configure how the stack traces are written to the logs. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
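The cleanup committed here can be illustrated with a minimal sketch. The actual patch targets Apache Commons Logging (`Log.error(String, Throwable)`); the example below uses `java.util.logging` as a dependency-free stand-in, and the class name, message text, and exception are invented for illustration.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.logging.Level;
import java.util.logging.Logger;

public class LoggingStyles {
    private static final Logger LOG = Logger.getLogger(LoggingStyles.class.getName());

    // Old style: flatten the stack trace into a String before logging
    // (essentially what StringUtils.stringifyException does internally).
    static String stringifyException(Throwable e) {
        StringWriter sw = new StringWriter();
        e.printStackTrace(new PrintWriter(sw, true));
        return sw.toString();
    }

    public static void main(String[] args) {
        Exception e = new IllegalStateException("boom");

        // Before: the trace is baked into the message string, so the
        // log formatter cannot control how (or whether) it is rendered.
        LOG.severe("operation failed: " + stringifyException(e));

        // After: pass the Throwable to the logger and let the configured
        // formatter decide how the stack trace is written to the logs.
        LOG.log(Level.SEVERE, "operation failed", e);
    }
}
```

The second form is what the patch moves the HDFS code toward: the exception object reaches the logging backend intact, so users can configure stack-trace rendering centrally.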
[jira] [Updated] (HDFS-2131) Tests for HADOOP-7361
[ https://issues.apache.org/jira/browse/HDFS-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Foley updated HDFS-2131: - Attachment: HADOOP-7361-test_v3.patch Nicholas, it's my understanding that having once run against uploaded file [12485690] (HADOOP-7361-test.patch of 08/Jul/11), Jenkins will not test the same file again even if Patch Available is bounced. So I'm uploading another copy of the same file, to trigger auto-test. I've named it _v3 to clarify that it is the same file as the third one Uma uploaded. Uma, in future please change the name of a file when you update it, e.g. by adding _v2 to it. Otherwise we can't tell if the new upload is a new file, or just another copy of the same file. Thanks. Tests for HADOOP-7361 - Key: HDFS-2131 URL: https://issues.apache.org/jira/browse/HDFS-2131 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.20.1, 0.23.0 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Attachments: HADOOP-7361-test.patch, HADOOP-7361-test.patch, HADOOP-7361-test.patch, HADOOP-7361-test_v3.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1977) Stop using StringUtils.stringifyException()
[ https://issues.apache.org/jira/browse/HDFS-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064279#comment-13064279 ] Hudson commented on HDFS-1977: -- Integrated in Hadoop-Hdfs-trunk-Commit #783 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/783/]) HDFS-1977. Stop using StringUtils.stringifyException(). Contributed by Bharath Mundlapudi. jitendra : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1145834 Files : * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hdfs/CHANGES.txt * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataXceiverServer.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/balancer/NameNodeConnector.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/UpgradeObjectDatanode.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/BlockSender.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * 
/hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java Stop using StringUtils.stringifyException() --- Key: HDFS-1977 URL: https://issues.apache.org/jira/browse/HDFS-1977 Project: Hadoop HDFS Issue Type: Improvement Reporter: Joey Echeverria Assignee: Bharath Mundlapudi Priority: Minor Attachments: HDFS-1977-1.patch, HDFS-1977-2.patch, HDFS-1977-3.patch, HDFS-1977-4.patch The old version of the logging APIs didn't support logging stack traces by passing exceptions to the logging methods (e.g. Log.error()). A number of log statements make use of StringUtils.stringifyException() to get around the old behavior. It would be nice if this could get cleaned up to make use of the logger's stack trace printing. This also gives users more control since you can configure how the stack traces are written to the logs. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2131) Tests for HADOOP-7361
[ https://issues.apache.org/jira/browse/HDFS-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064298#comment-13064298 ] Hadoop QA commented on HDFS-2131: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12486250/HADOOP-7361-test_v3.patch against trunk revision 1145834. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 15 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed core unit tests. +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/913//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/913//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/913//console This message is automatically generated. Tests for HADOOP-7361 - Key: HDFS-2131 URL: https://issues.apache.org/jira/browse/HDFS-2131 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.20.1, 0.23.0 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Attachments: HADOOP-7361-test.patch, HADOOP-7361-test.patch, HADOOP-7361-test.patch, HADOOP-7361-test_v3.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2131) Tests for HADOOP-7361
[ https://issues.apache.org/jira/browse/HDFS-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064312#comment-13064312 ] Tsz Wo (Nicholas), SZE commented on HDFS-2131: -- Matt, thanks for posting a new file. Tests for HADOOP-7361 - Key: HDFS-2131 URL: https://issues.apache.org/jira/browse/HDFS-2131 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.20.1, 0.23.0 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Fix For: 0.23.0 Attachments: HADOOP-7361-test.patch, HADOOP-7361-test.patch, HADOOP-7361-test.patch, HADOOP-7361-test_v3.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2131) Tests for HADOOP-7361
[ https://issues.apache.org/jira/browse/HDFS-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-2131: - Resolution: Fixed Fix Version/s: 0.23.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I have committed this. Thanks, Uma. Also thanks Daryn for reviewing it. Tests for HADOOP-7361 - Key: HDFS-2131 URL: https://issues.apache.org/jira/browse/HDFS-2131 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.20.1, 0.23.0 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Fix For: 0.23.0 Attachments: HADOOP-7361-test.patch, HADOOP-7361-test.patch, HADOOP-7361-test.patch, HADOOP-7361-test_v3.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2131) Tests for HADOOP-7361
[ https://issues.apache.org/jira/browse/HDFS-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064321#comment-13064321 ] Hudson commented on HDFS-2131: -- Integrated in Hadoop-Hdfs-trunk-Commit #784 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/784/]) HDFS-2131. Add new tests for the -overwrite/-f option in put and copyFromLocal by HADOOP-7361. Contributed by Uma Maheswara Rao G. szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1145843 Files : * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/cli/testHDFSConf.xml * /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/TestDFSShell.java * /hadoop/common/trunk/hdfs/CHANGES.txt Tests for HADOOP-7361 - Key: HDFS-2131 URL: https://issues.apache.org/jira/browse/HDFS-2131 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.20.1, 0.23.0 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Fix For: 0.23.0 Attachments: HADOOP-7361-test.patch, HADOOP-7361-test.patch, HADOOP-7361-test.patch, HADOOP-7361-test_v3.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
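The behavior exercised by the new tests can be sketched from the shell. This is a hedged illustration against a running cluster; the local file name and HDFS path are invented.

```shell
# Without -f, put refuses to overwrite an existing destination file.
hadoop fs -put local.txt /user/alice/data.txt

# With the -f (overwrite) option introduced by HADOOP-7361, both
# put and copyFromLocal replace the existing destination instead.
hadoop fs -put -f local.txt /user/alice/data.txt
hadoop fs -copyFromLocal -f local.txt /user/alice/data.txt
```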
[jira] [Commented] (HDFS-347) DFS read performance suboptimal when client co-located on nodes with data
[ https://issues.apache.org/jira/browse/HDFS-347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064352#comment-13064352 ] dhruba borthakur commented on HDFS-347: --- we do not run with HDFS-941. I will post numbers once I get that incorporated into our production environment. DFS read performance suboptimal when client co-located on nodes with data - Key: HDFS-347 URL: https://issues.apache.org/jira/browse/HDFS-347 Project: Hadoop HDFS Issue Type: Improvement Reporter: George Porter Assignee: Todd Lipcon Attachments: BlockReaderLocal1.txt, HADOOP-4801.1.patch, HADOOP-4801.2.patch, HADOOP-4801.3.patch, HDFS-347-branch-20-append.txt, all.tsv, hdfs-347.png, hdfs-347.txt, local-reads-doc One of the major strategies Hadoop uses to get scalable data processing is to move the code to the data. However, putting the DFS client on the same physical node as the data blocks it acts on doesn't improve read performance as much as expected. After looking at Hadoop and O/S traces (via HADOOP-4049), I think the problem is due to the HDFS streaming protocol causing many more read I/O operations (iops) than necessary. Consider the case of a DFSClient fetching a 64 MB disk block from the DataNode process (running in a separate JVM) running on the same machine. The DataNode will satisfy the single disk block request by sending data back to the HDFS client in 64-KB chunks. In BlockSender.java, this is done in the sendChunk() method, relying on Java's transferTo() method. Depending on the host O/S and JVM implementation, transferTo() is implemented as either a sendfilev() syscall or a pair of mmap() and write(). In either case, each chunk is read from the disk by issuing a separate I/O operation for each chunk. The result is that the single request for a 64-MB block ends up hitting the disk as over a thousand smaller requests for 64-KB each. 
Since the DFSClient runs in a different JVM and process than the DataNode, shuttling data from the disk to the DFSClient also results in context switches each time network packets get sent (in this case, the 64-KB chunk turns into a large number of 1500 byte packet send operations). Thus we see a large number of context switches for each block send operation. I'd like to get some feedback on the best way to address this, but I think the answer is to provide a mechanism for a DFSClient to directly open data blocks that happen to be on the same machine. It could do this by examining the set of LocatedBlocks returned by the NameNode, marking those that should be resident on the local host. Since the DataNode and DFSClient (probably) share the same hadoop configuration, the DFSClient should be able to find the files holding the block data, and it could directly open them and send data back to the client. This would avoid the context switches imposed by the network layer, and would allow for much larger read buffers than 64KB, which should reduce the number of iops imposed by each read block operation. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
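The iops argument in the issue description can be made concrete with a small experiment. This is only a sketch of the general point, not HDFS code: it counts how many read() calls are needed to consume a local file at different buffer sizes, standing in for the 64-KB chunking of the streaming protocol versus the larger buffers a co-located reader could use after opening the block file directly. The file size and buffer sizes are chosen for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class LocalBlockRead {
    // Count the read() calls needed to consume a file with a given
    // buffer size -- a rough proxy for the iops discussed above.
    static int countReads(Path file, int bufSize) throws IOException {
        byte[] buf = new byte[bufSize];
        int reads = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (in.read(buf) != -1) {
                reads++;
            }
        }
        return reads;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a local block replica (8 MB rather than 64 MB,
        // to keep the demo quick).
        Path block = Files.createTempFile("block", ".dat");
        Files.write(block, new byte[8 * 1024 * 1024]);

        // Streaming-protocol-sized chunks vs. a larger local-read buffer:
        // the bigger buffer needs far fewer read operations.
        System.out.println("64 KB buffer: " + countReads(block, 64 * 1024) + " reads");
        System.out.println("1 MB buffer:  " + countReads(block, 1024 * 1024) + " reads");
        Files.delete(block);
    }
}
```

The same ratio applies at block scale: a 64-MB block consumed in 64-KB chunks takes on the order of a thousand reads, which is exactly the overhead the proposed local-read path avoids.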
[jira] [Commented] (HDFS-2138) fix aop.xml to refer to the right hadoop-common.version variable
[ https://issues.apache.org/jira/browse/HDFS-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064359#comment-13064359 ] Konstantin Boudnik commented on HDFS-2138: -- I have checked that the correct version is set for the jar files. Everything works. +1 Good stuff, Giri. Thanks for fixing this. fix aop.xml to refer to the right hadoop-common.version variable Key: HDFS-2138 URL: https://issues.apache.org/jira/browse/HDFS-2138 Project: Hadoop HDFS Issue Type: Bug Components: build Affects Versions: 0.23.0 Reporter: Giridharan Kesavan Assignee: Giridharan Kesavan Attachments: HDFS-2138-trunk.patch, HDFS-2138.PATCH aop.xml refers to the hadoop-common version through the project.version variable; instead, the hadoop-common version should be referred to through hadoop-common.version, set in the ivy/libraries.properties file. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
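The shape of the fix can be pictured with a minimal config sketch. Everything here except the hadoop-common.version property name is assumed for illustration; the actual patch is in HDFS-2138-trunk.patch.

```xml
<!-- ivy/libraries.properties defines, e.g.:
       hadoop-common.version=0.23.0-SNAPSHOT
     aop.xml then resolves the hadoop-common jar against that property
     instead of the project's own ${project.version}: -->
<fileset dir="${ivy.lib.dir}">
  <include name="hadoop-common-${hadoop-common.version}.jar"/>
</fileset>
```

The point of the change is that the HDFS build can depend on a hadoop-common release whose version differs from the HDFS project version, so the two must be tracked by separate properties.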