[jira] [Commented] (HDFS-3672) Expose disk-location information for blocks to enable better scheduling
[ https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433967#comment-13433967 ]

Hadoop QA commented on HDFS-3672:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12540815/hdfs-3672-9.patch
against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 2 new or modified test files.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 eclipse:eclipse. The patch built with eclipse:eclipse.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
        org.apache.hadoop.hdfs.server.namenode.TestFsck
    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3001//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3001//console

This message is automatically generated.

Expose disk-location information for blocks to enable better scheduling
------------------------------------------------------------------------
Key: HDFS-3672
URL: https://issues.apache.org/jira/browse/HDFS-3672
Project: Hadoop HDFS
Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
Attachments: design-doc-v1.pdf, design-doc-v2.pdf, hdfs-3672-1.patch, hdfs-3672-2.patch, hdfs-3672-3.patch, hdfs-3672-4.patch, hdfs-3672-5.patch, hdfs-3672-6.patch, hdfs-3672-7.patch, hdfs-3672-8.patch, hdfs-3672-9.patch

Currently, HDFS exposes on which datanodes a block resides, which allows clients to make scheduling decisions for locality and load balancing. Extending this to also expose on which disk on a datanode a block resides would enable even better scheduling, on a per-disk rather than coarse per-datanode basis. This API would likely look similar to FileSystem#getFileBlockLocations, but also involve a series of RPCs to the responsible datanodes to determine disk ids.
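For context, a minimal sketch of how a client obtains block locations today with the existing FileSystem#getFileBlockLocations API; the comments marking where a per-disk variant would fit are assumptions about the shape of such an API, not the committed HDFS-3672 interface:

{code}
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DiskLocationSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path p = new Path(args[0]);
    FileStatus stat = fs.getFileStatus(p);

    // Today: per-datanode granularity only.
    BlockLocation[] locs = fs.getFileBlockLocations(stat, 0, stat.getLen());
    for (BlockLocation loc : locs) {
      System.out.println("offset=" + loc.getOffset()
          + " hosts=" + Arrays.toString(loc.getHosts()));
      // A per-disk variant (hypothetical) would additionally carry an opaque
      // disk id per replica, filled in by follow-up RPCs to the datanodes
      // hosting the block, so a scheduler could balance work across disks
      // rather than only across datanodes.
    }
  }
}
{code}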
[jira] [Commented] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433977#comment-13433977 ]

Hadoop QA commented on HDFS-3723:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12540826/HDFS-3723.001.patch
against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 2 new or modified test files.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 eclipse:eclipse. The patch built with eclipse:eclipse.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:
        org.apache.hadoop.ha.TestHAAdmin
        org.apache.hadoop.ha.TestZKFailoverController
        org.apache.hadoop.hdfs.tools.TestDFSHAAdmin
        org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3000//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3000//console

This message is automatically generated.

All commands should support meaningful --help
---------------------------------------------
Key: HDFS-3723
URL: https://issues.apache.org/jira/browse/HDFS-3723
Project: Hadoop HDFS
Issue Type: Improvement
Components: scripts, tools
Affects Versions: 2.0.0-alpha
Reporter: E. Sammer
Assignee: Jing Zhao
Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.patch, HDFS-3723.patch

Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking.

{code}
[esammer@hadoop-fed01 ~]# hdfs zkfc --help
Exception in thread "main" org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode.
        at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168)
{code}

This would go a long way toward better usability for ops staff.
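An illustrative sketch only (not the committed HDFS-3723 patch) of the "option checking before configuration checking" idea: scan the arguments for a help flag before anything that requires HA configuration, so "hdfs zkfc --help" prints usage instead of failing in setConf():

{code}
public class ZkfcHelpFirst {
  private static final String USAGE =
      "Usage: hdfs zkfc [-formatZK [-force] [-nonInteractive]]";

  public static void main(String[] args) {
    // Help must work regardless of current state or configuration.
    for (String arg : args) {
      if ("-h".equals(arg) || "-help".equals(arg) || "--help".equals(arg)) {
        System.out.println(USAGE);
        return;
      }
    }
    // Only after the help check would the real tool be invoked, e.g.
    // System.exit(ToolRunner.run(new DFSZKFailoverController(), args));
    // which is where setConf() throws if HA is not enabled.
  }
}
{code}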
[jira] [Updated] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao updated HDFS-3723:
----------------------------
Attachment: HDFS-3723.002.patch

Updated TestDFSHAAdmin and TestHAAdmin: the expected help information may now also appear in the normal output (originally it was only checked in the error output).

All commands should support meaningful --help
---------------------------------------------
Key: HDFS-3723
URL: https://issues.apache.org/jira/browse/HDFS-3723
Project: Hadoop HDFS
Issue Type: Improvement
Components: scripts, tools
Affects Versions: 2.0.0-alpha
Reporter: E. Sammer
Assignee: Jing Zhao
Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.002.patch, HDFS-3723.patch, HDFS-3723.patch

Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking.

{code}
[esammer@hadoop-fed01 ~]# hdfs zkfc --help
Exception in thread "main" org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode.
        at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168)
{code}

This would go a long way toward better usability for ops staff.
[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname
[ https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434046#comment-13434046 ]

Hadoop QA commented on HDFS-3150:
---------------------------------

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12540836/hdfs-3150.txt
against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 10 new or modified test files.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 eclipse:eclipse. The patch built with eclipse:eclipse.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3002//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3002//console

This message is automatically generated.

Add option for clients to contact DNs via hostname
---------------------------------------------------
Key: HDFS-3150
URL: https://issues.apache.org/jira/browse/HDFS-3150
Project: Hadoop HDFS
Issue Type: New Feature
Components: data-node, hdfs client
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Eli Collins
Assignee: Eli Collins
Fix For: 1.1.0
Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt

The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} is the wildcard); however, per HADOOP-6867, only the source address (IP) of the registration is given to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup; this had the side effect of breaking DN multihoming (the client cannot route the IP exposed by the NN if the DN registers with an interface that has a cluster-private IP). To fix this, let's add back the option for Datanodes to be accessed by hostname. This can be done by:
# Modifying the primary field of the Datanode descriptor to be the hostname, or
# Modifying client/Datanode-to-Datanode access to use the hostname field instead of the IP

Approach #2 does not require an incompatible client protocol change and is much less invasive. It minimizes the scope of modification to just the places where clients and Datanodes connect, vs changing all uses of Datanode identifiers. New client and Datanode configuration options are introduced:
- {{dfs.client.use.datanode.hostname}} indicates all client-to-datanode connections should use the datanode hostname (as clients outside the cluster may not be able to route the IP)
- {{dfs.datanode.use.datanode.hostname}} indicates whether Datanodes should use hostnames when connecting to other Datanodes for data transfer

If the configuration options are not used, there is no change in the current behavior.
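A small usage sketch of the two configuration keys named in the issue, assuming the defaults remain false so existing behavior is unchanged; setting them programmatically here is just one way to apply them (they can equally go in hdfs-site.xml):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HostnameAccessExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Client -> datanode connections resolve and connect by the datanode's
    // hostname instead of the (possibly unroutable) IP reported by the NN.
    conf.setBoolean("dfs.client.use.datanode.hostname", true);
    // Datanode -> datanode transfers likewise connect by hostname.
    conf.setBoolean("dfs.datanode.use.datanode.hostname", true);

    FileSystem fs = FileSystem.get(conf);
    System.out.println(fs.exists(new Path("/")));
  }
}
{code}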
[jira] [Commented] (HDFS-3801) Provide a way to disable browsing of files from the web UI
[ https://issues.apache.org/jira/browse/HDFS-3801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434064#comment-13434064 ]

Harsh J commented on HDFS-3801:
-------------------------------

Suresh - the general need seems to be to prevent users external to the group that uses HDFS from reading/browsing files. This can be done today by enabling the kerberos hadoop.http.authentication.type, but not many users need the web file-browsing facility itself, so it would be beneficial if it could be toggled off to prevent anyone (in or out of the group) from browsing. This would also help as a toggle on non-secure installations.

Provide a way to disable browsing of files from the web UI
-----------------------------------------------------------
Key: HDFS-3801
URL: https://issues.apache.org/jira/browse/HDFS-3801
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Priority: Minor

A few times we've had requests from users who wish to disable browsing of the filesystem in the web UI completely, while keeping other servlet functionality enabled (such as fsck, etc.). Right now, the cheap way to do this is by blocking out the DN web port (50075) from access by clients, but that also hampers HFTP transfers. We should instead provide a toggle config for the JSPs to use and disallow browsing if the toggle is enabled. The config can default to true, so as not to change the current behavior.
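A minimal sketch of the proposed toggle; the key name "dfs.namenode.web.browse.enabled" is purely illustrative (the issue does not name a key), and the default keeps browsing enabled as the description requires:

{code}
import org.apache.hadoop.conf.Configuration;

public class BrowseToggle {
  // Hypothetical key name; only the "enabled by default" behavior is from the issue.
  static final String BROWSE_KEY = "dfs.namenode.web.browse.enabled";

  static boolean isBrowsingEnabled(Configuration conf) {
    return conf.getBoolean(BROWSE_KEY, true);
  }

  // The browse JSPs (via the JspHelper path) would call isBrowsingEnabled()
  // before rendering a directory listing and return 403 Forbidden when it is
  // false, leaving fsck and the other servlets untouched.
}
{code}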
[jira] [Commented] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434098#comment-13434098 ]

Daryn Sharp commented on HDFS-3788:
-----------------------------------

bq. Nicholas: How about first check the transfer-encoding, if it is chunked, then no content-length check?

Exactly. However, you need to update the patch to check both the Transfer-Encoding and TE headers, and the headers may contain multiple comma-separated values. I haven't tested, but I would expect Java's input stream for chunked responses to throw an EOF exception if the connection is broken, so you might want to add a test for that.

bq. Eli: Note that a get of a 3gb file works but not distcp, what path is different?

The code paths should be identical, since it's the creation of the input stream that does the content-length check. I can't see how distcp could possibly work unless distcp is not using the filesystem class...

distcp can't copy large files using webhdfs due to missing Content-Length header
---------------------------------------------------------------------------------
Key: HDFS-3788
URL: https://issues.apache.org/jira/browse/HDFS-3788
Project: Hadoop HDFS
Issue Type: Bug
Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha
Reporter: Eli Collins
Assignee: Tsz Wo (Nicholas), SZE
Priority: Critical
Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch

The following command fails when data1 contains a 3gb file. It passes when using hftp or when the directory just contains smaller (2gb) files, so it looks like a webhdfs issue with large files.

{{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 hdfs://localhost:8020/user/eli/data2}}
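A sketch of the header handling Daryn describes (illustrative only, not the committed patch): skip the Content-Length check when either the Transfer-Encoding or TE header, possibly comma-separated, contains "chunked":

{code}
import java.net.HttpURLConnection;

public class ChunkedCheck {
  /** True if a (possibly comma-separated) header value contains "chunked". */
  static boolean containsChunked(String headerValue) {
    if (headerValue == null) {
      return false;
    }
    for (String v : headerValue.split(",")) {
      if ("chunked".equalsIgnoreCase(v.trim())) {
        return true;
      }
    }
    return false;
  }

  static boolean isChunkedResponse(HttpURLConnection conn) {
    return containsChunked(conn.getHeaderField("Transfer-Encoding"))
        || containsChunked(conn.getHeaderField("TE"));
  }

  // When isChunkedResponse(conn) is true there is no Content-Length to verify;
  // otherwise a missing Content-Length should still be treated as an error.
}
{code}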
[jira] [Commented] (HDFS-3794) WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.
[ https://issues.apache.org/jira/browse/HDFS-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434102#comment-13434102 ]

Daryn Sharp commented on HDFS-3794:
-----------------------------------

Shouldn't it check whether length - offset goes negative? Or is that checked elsewhere?

WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.
------------------------------------------------------------------------------------------------------
Key: HDFS-3794
URL: https://issues.apache.org/jira/browse/HDFS-3794
Project: Hadoop HDFS
Issue Type: Bug
Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-alpha
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Attachments: HDFS-3794.patch

When an offset is specified, the HTTP header Content-Length still contains the original file size. E.g. if the original file is 100 bytes and the offset specified is 10, then the HTTP Content-Length ought to be 90. Currently it is still returned as 100. This causes curl to give error 18 and Java to throw a ConnectionClosedException.
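A minimal illustration of the behavior being discussed (not the actual patch): the Content-Length for an OPEN with an offset should be the number of bytes remaining, with out-of-range offsets rejected rather than producing a negative length:

{code}
public class ContentLengthForOffset {
  /** Content-Length for an OPEN at the given offset. */
  static long remainingLength(long fileLength, long offset) {
    if (offset < 0 || offset >= fileLength) {
      throw new IllegalArgumentException(
          "Offset=" + offset + " out of the range [0, " + fileLength + ")");
    }
    return fileLength - offset;   // e.g. 100-byte file, offset 10 -> 90
  }
}
{code}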
[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname
[ https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434112#comment-13434112 ]

Daryn Sharp commented on HDFS-3150:
-----------------------------------

Are you intending to address the token issues too? Or did I overlook that in my skim of the new patch? I'm still wondering if we should deprecate use_ip and unify it with the two new keys. Thoughts?

Add option for clients to contact DNs via hostname
---------------------------------------------------
Key: HDFS-3150
URL: https://issues.apache.org/jira/browse/HDFS-3150
Project: Hadoop HDFS
Issue Type: New Feature
Components: data-node, hdfs client
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Eli Collins
Assignee: Eli Collins
Fix For: 1.1.0
Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt

The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} is the wildcard); however, per HADOOP-6867, only the source address (IP) of the registration is given to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup; this had the side effect of breaking DN multihoming (the client cannot route the IP exposed by the NN if the DN registers with an interface that has a cluster-private IP). To fix this, let's add back the option for Datanodes to be accessed by hostname. This can be done by:
# Modifying the primary field of the Datanode descriptor to be the hostname, or
# Modifying client/Datanode-to-Datanode access to use the hostname field instead of the IP

Approach #2 does not require an incompatible client protocol change and is much less invasive. It minimizes the scope of modification to just the places where clients and Datanodes connect, vs changing all uses of Datanode identifiers. New client and Datanode configuration options are introduced:
- {{dfs.client.use.datanode.hostname}} indicates all client-to-datanode connections should use the datanode hostname (as clients outside the cluster may not be able to route the IP)
- {{dfs.datanode.use.datanode.hostname}} indicates whether Datanodes should use hostnames when connecting to other Datanodes for data transfer

If the configuration options are not used, there is no change in the current behavior.
[jira] [Updated] (HDFS-3597) SNN can fail to start on upgrade
[ https://issues.apache.org/jira/browse/HDFS-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated HDFS-3597:
------------------------------
Fix Version/s: 0.23.3

I've committed this to 23.

SNN can fail to start on upgrade
---------------------------------
Key: HDFS-3597
URL: https://issues.apache.org/jira/browse/HDFS-3597
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Andy Isaacson
Assignee: Andy Isaacson
Priority: Minor
Fix For: 0.23.3, 3.0.0, 2.2.0-alpha
Attachments: hdfs-3597-2.txt, hdfs-3597-3.txt, hdfs-3597-4.txt, hdfs-3597.txt

When upgrading from 1.x to 2.0.0, the SecondaryNameNode can fail to start up:

{code}
2012-06-16 09:52:33,812 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint
java.io.IOException: Inconsistent checkpoint fields.
LV = -40 namespaceID = 64415959 cTime = 1339813974990 ; clusterId = CID-07a82b97-8d04-4fdd-b3a1-f40650163245 ; blockpoolId = BP-1792677198-172.29.121.67-1339813967723.
Expecting respectively: -19; 64415959; 0; ; .
        at org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:120)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:454)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:334)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$2.run(SecondaryNameNode.java:301)
        at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:438)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:297)
        at java.lang.Thread.run(Thread.java:662)
{code}

The error check we're hitting came from HDFS-1073, and it's intended to verify that we're connecting to the correct NN. But the check is too strict and considers a different metadata version to be the same as a different clusterID. I believe the check in {{doCheckpoint}} simply needs to explicitly check for and handle the update case.
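A heavily simplified illustration of the kind of relaxation described above; this is an assumption about the shape of the fix, not the actual HDFS-3597 patch, which may handle the upgrade case differently:

{code}
public class CheckpointFieldsCheck {
  static void validate(String snnClusterId, String nnClusterId) {
    // During an upgrade the SNN side has no clusterID/blockpoolID yet; treat
    // empty fields as "not yet initialized" rather than as an inconsistency.
    boolean notYetInitialized = snnClusterId == null || snnClusterId.isEmpty();
    if (!notYetInitialized && !snnClusterId.equals(nnClusterId)) {
      throw new IllegalStateException(
          "Inconsistent checkpoint fields: clusterId " + snnClusterId
              + " does not match " + nnClusterId);
    }
    // A differing layout version alone (-19 vs -40 after a 1.x -> 2.x upgrade)
    // is not by itself evidence of talking to the wrong namenode.
  }
}
{code}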
[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434190#comment-13434190 ]

Robert Joseph Evans commented on HDFS-3731:
-------------------------------------------

I am not an HDFS expert but the patch looks good to me. +1 non-binding.

2.0 release upgrade must handle blocks being written from 1.0
---------------------------------------------------------------
Key: HDFS-3731
URL: https://issues.apache.org/jira/browse/HDFS-3731
Project: Hadoop HDFS
Issue Type: Bug
Components: data-node
Affects Versions: 2.0.0-alpha
Reporter: Suresh Srinivas
Assignee: Colin Patrick McCabe
Priority: Blocker
Attachments: HDFS-3731.002.patch, HDFS-3731.003.patch

Release 2.0 upgrades must handle blocks being written to (bbw) files from the 1.0 release. Problem reported by Brahma Reddy.
[jira] [Updated] (HDFS-3794) WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.
[ https://issues.apache.org/jira/browse/HDFS-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Prakash updated HDFS-3794:
-------------------------------
Attachment: HDFS-3794.patch

Thanks a lot Nicholas! I'm afraid I don't know enough about the code. I'll defer to you on this! I'm attaching the modified patch with the change you suggested.

Thanks Daryn. It discovers an out-of-range offset and throws an exception before reaching this method.

{noformat}
$ curl -L "http://HOST:PORT/webhdfs/v1/somePath/someFile?op=OPEN&offset=457236547"
{"RemoteException":{"exception":"IOException","javaClassName":"java.io.IOException","message":"Offset=457236547 out of the range [0, 457236477); OPEN, path=/somePath/someFile"}}
{noformat}

WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.
------------------------------------------------------------------------------------------------------
Key: HDFS-3794
URL: https://issues.apache.org/jira/browse/HDFS-3794
Project: Hadoop HDFS
Issue Type: Bug
Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-alpha
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Attachments: HDFS-3794.patch, HDFS-3794.patch

When an offset is specified, the HTTP header Content-Length still contains the original file size. E.g. if the original file is 100 bytes and the offset specified is 10, then the HTTP Content-Length ought to be 90. Currently it is still returned as 100. This causes curl to give error 18 and Java to throw a ConnectionClosedException.
[jira] [Commented] (HDFS-3794) WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.
[ https://issues.apache.org/jira/browse/HDFS-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434193#comment-13434193 ]

Ravi Prakash commented on HDFS-3794:
------------------------------------

I tested the modified patch and it too worked in all three previous cases.

WebHDFS Open used with Offset returns the original (and incorrect) Content Length in the HTTP Header.
------------------------------------------------------------------------------------------------------
Key: HDFS-3794
URL: https://issues.apache.org/jira/browse/HDFS-3794
Project: Hadoop HDFS
Issue Type: Bug
Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha, 2.1.0-alpha
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Attachments: HDFS-3794.patch, HDFS-3794.patch

When an offset is specified, the HTTP header Content-Length still contains the original file size. E.g. if the original file is 100 bytes and the offset specified is 10, then the HTTP Content-Length ought to be 90. Currently it is still returned as 100. This causes curl to give error 18 and Java to throw a ConnectionClosedException.
[jira] [Commented] (HDFS-3796) Speed up edit log tests by avoiding fsync()
[ https://issues.apache.org/jira/browse/HDFS-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434204#comment-13434204 ]

Colin Patrick McCabe commented on HDFS-3796:
---------------------------------------------

Great idea. I think we should also do this in {{TestNameNodeRecovery}}, {{TestFileJournalManager}}, {{TestSecurityTokenEditLog}}, {{TestEditLogsDuringFailover}}, and {{TestEditLogFileOutputStream}}.

Speed up edit log tests by avoiding fsync()
--------------------------------------------
Key: HDFS-3796
URL: https://issues.apache.org/jira/browse/HDFS-3796
Project: Hadoop HDFS
Issue Type: Improvement
Components: test
Affects Versions: 3.0.0, 2.2.0-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
Attachments: hdfs-3796.txt

Our edit log tests are very slow because they incur a lot of fsyncs as they write out transactions. Since fsync() has no effect except in the case of power outages or system crashes, and we don't care about power outages in the context of tests, we can safely skip the fsync without any loss in coverage. In my tests, this sped up TestEditLog by about 5x. The testFuzzSequences test case improved from ~83 seconds with fsync to about 5 seconds without. These results are from my SSD laptop; they are probably even more drastic on spinning media.
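An illustrative test-only hook for skipping fsync; the class and flag names here are assumptions for the sketch, not the names used in the attached patch:

{code}
import java.io.FileDescriptor;
import java.io.IOException;

public class EditLogSyncHook {
  // Visible for testing: edit log unit tests set this to true.
  public static volatile boolean skipFsyncForTesting = false;

  public static void sync(FileDescriptor fd) throws IOException {
    if (skipFsyncForTesting) {
      return;            // durability only matters for crashes/power loss
    }
    fd.sync();           // production path still forces data to the device
  }
}
{code}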
[jira] [Updated] (HDFS-3791) Backport HDFS-173 to Branch-1 : Recursively deleting a directory with millions of files makes NameNode unresponsive for other commands until the deletion completes
[ https://issues.apache.org/jira/browse/HDFS-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uma Maheswara Rao G updated HDFS-3791:
--------------------------------------
Summary: Backport HDFS-173 to Branch-1 : Recursively deleting a directory with millions of files makes NameNode unresponsive for other commands until the deletion completes (was: Backport HDFS-173 Recursively deleting a directory with millions of files makes NameNode unresponsive for other commands until the deletion completes)

Backport HDFS-173 to Branch-1 : Recursively deleting a directory with millions of files makes NameNode unresponsive for other commands until the deletion completes
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
Key: HDFS-3791
URL: https://issues.apache.org/jira/browse/HDFS-3791
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 1.0.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G

Backport HDFS-173. See the [comment|https://issues.apache.org/jira/browse/HDFS-2815?focusedCommentId=13422007&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13422007] for more details.
[jira] [Updated] (HDFS-3791) Backport HDFS-173 to Branch-1 : Recursively deleting a directory with millions of files makes NameNode unresponsive for other commands until the deletion completes
[ https://issues.apache.org/jira/browse/HDFS-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uma Maheswara Rao G updated HDFS-3791:
--------------------------------------
Attachment: HDFS-3791.patch

Backport HDFS-173 to Branch-1 : Recursively deleting a directory with millions of files makes NameNode unresponsive for other commands until the deletion completes
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
Key: HDFS-3791
URL: https://issues.apache.org/jira/browse/HDFS-3791
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 1.0.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Attachments: HDFS-3791.patch

Backport HDFS-173. See the [comment|https://issues.apache.org/jira/browse/HDFS-2815?focusedCommentId=13422007&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13422007] for more details.
[jira] [Commented] (HDFS-3791) Backport HDFS-173 to Branch-1 : Recursively deleting a directory with millions of files makes NameNode unresponsive for other commands until the deletion completes
[ https://issues.apache.org/jira/browse/HDFS-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434209#comment-13434209 ]

Uma Maheswara Rao G commented on HDFS-3791:
-------------------------------------------

Hi Suresh, I have just attached a back-port patch here. Could you please take a look?

Backport HDFS-173 to Branch-1 : Recursively deleting a directory with millions of files makes NameNode unresponsive for other commands until the deletion completes
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
Key: HDFS-3791
URL: https://issues.apache.org/jira/browse/HDFS-3791
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 1.0.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Attachments: HDFS-3791.patch

Backport HDFS-173. See the [comment|https://issues.apache.org/jira/browse/HDFS-2815?focusedCommentId=13422007&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13422007] for more details.
[jira] [Commented] (HDFS-3048) Small race in BlockManager#close
[ https://issues.apache.org/jira/browse/HDFS-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434212#comment-13434212 ]

Karthik Kambatla commented on HDFS-3048:
-----------------------------------------

+1

Small race in BlockManager#close
---------------------------------
Key: HDFS-3048
URL: https://issues.apache.org/jira/browse/HDFS-3048
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 2.0.0-alpha
Reporter: Eli Collins
Assignee: Andy Isaacson
Attachments: hdfs-3048.txt, hdfs-3787-2.txt

There's a small race in BlockManager#close: we close the BlocksMap before the replication monitor, which means the replication monitor can NPE if it tries to access the blocks map. We need to swap the order (close the blocks map after shutting down the replication monitor).
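A simplified sketch of the ordering fix the issue describes; this is not the actual BlockManager code, just an illustration of stopping the consumer thread before tearing down the structure it reads:

{code}
import java.util.HashMap;
import java.util.Map;

public class OrderedClose {
  private Thread replicationMonitor;
  private Map<Long, Object> blocksMap = new HashMap<Long, Object>();

  OrderedClose(Runnable monitor) {
    replicationMonitor = new Thread(monitor, "ReplicationMonitor");
    replicationMonitor.start();
  }

  public void close() throws InterruptedException {
    // 1. Stop the replication monitor first ...
    replicationMonitor.interrupt();
    replicationMonitor.join(3000);
    // 2. ... and only then tear down the blocks map it reads from,
    //    so the monitor can never dereference a cleared map.
    blocksMap.clear();
    blocksMap = null;
  }
}
{code}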
[jira] [Commented] (HDFS-3658) TestDFSClientRetries#testNamenodeRestart failed
[ https://issues.apache.org/jira/browse/HDFS-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434213#comment-13434213 ]

Aaron T. Myers commented on HDFS-3658:
---------------------------------------

Hi Nicholas, did you commit this to branch-2.1.0-alpha as well as branch-2? If not, I believe the fix version should be set to 2.2.0-alpha. Do you agree?

TestDFSClientRetries#testNamenodeRestart failed
------------------------------------------------
Key: HDFS-3658
URL: https://issues.apache.org/jira/browse/HDFS-3658
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.0.0-alpha
Reporter: Eli Collins
Assignee: Tsz Wo (Nicholas), SZE
Fix For: 1.2.0, 2.1.0-alpha
Attachments: h3658_20120808_b-1.patch, h3658_20120808.patch, test-log.txt

Saw the following fail on a jenkins run:

{noformat}
Error Message

expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51

Stacktrace

junit.framework.AssertionFailedError: expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51
        at junit.framework.Assert.fail(Assert.java:47)
        at junit.framework.Assert.failNotEquals(Assert.java:283)
        at junit.framework.Assert.assertEquals(Assert.java:64)
        at junit.framework.Assert.assertEquals(Assert.java:71)
        at org.apache.hadoop.hdfs.TestDFSClientRetries.testNamenodeRestart(TestDFSClientRetries.java:886)
{noformat}
[jira] [Updated] (HDFS-3258) Test for HADOOP-8144 (pseudoSortByDistance in NetworkTopology for first rack local node)
[ https://issues.apache.org/jira/browse/HDFS-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daryn Sharp updated HDFS-3258:
------------------------------
Fix Version/s: 0.23.3

Merged to branch 23.

Test for HADOOP-8144 (pseudoSortByDistance in NetworkTopology for first rack local node)
------------------------------------------------------------------------------------------
Key: HDFS-3258
URL: https://issues.apache.org/jira/browse/HDFS-3258
Project: Hadoop HDFS
Issue Type: Test
Components: test
Affects Versions: 0.23.0, 1.0.0
Reporter: Eli Collins
Assignee: Junping Du
Fix For: 0.23.3, 2.0.0-alpha
Attachments: HDFS-3258.patch, hdfs-3258.txt

For updating TestNetworkTopology to cover HADOOP-8144.
[jira] [Updated] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-3788:
------------------------------------------
Attachment: h3788_20120814.patch

Thanks Daryn for taking a look. Here is a new patch: h3788_20120814.patch

I have run the 3GB file test included in the previous patch. The test will not be committed since it takes 10 minutes.

distcp can't copy large files using webhdfs due to missing Content-Length header
---------------------------------------------------------------------------------
Key: HDFS-3788
URL: https://issues.apache.org/jira/browse/HDFS-3788
Project: Hadoop HDFS
Issue Type: Bug
Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha
Reporter: Eli Collins
Assignee: Tsz Wo (Nicholas), SZE
Priority: Critical
Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814.patch

The following command fails when data1 contains a 3gb file. It passes when using hftp or when the directory just contains smaller (2gb) files, so it looks like a webhdfs issue with large files.

{{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 hdfs://localhost:8020/user/eli/data2}}
[jira] [Updated] (HDFS-3658) TestDFSClientRetries#testNamenodeRestart failed
[ https://issues.apache.org/jira/browse/HDFS-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-3658:
------------------------------------------
Environment: Aaron, you are right. It should be 2.2.0-alpha.
Fix Version/s: (was: 2.1.0-alpha) 2.2.0-alpha

TestDFSClientRetries#testNamenodeRestart failed
------------------------------------------------
Key: HDFS-3658
URL: https://issues.apache.org/jira/browse/HDFS-3658
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.0.0-alpha
Environment: Aaron, you are right. It should be 2.2.0-alpha.
Reporter: Eli Collins
Assignee: Tsz Wo (Nicholas), SZE
Fix For: 1.2.0, 2.2.0-alpha
Attachments: h3658_20120808_b-1.patch, h3658_20120808.patch, test-log.txt

Saw the following fail on a jenkins run:

{noformat}
Error Message

expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51

Stacktrace

junit.framework.AssertionFailedError: expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51
        at junit.framework.Assert.fail(Assert.java:47)
        at junit.framework.Assert.failNotEquals(Assert.java:283)
        at junit.framework.Assert.assertEquals(Assert.java:64)
        at junit.framework.Assert.assertEquals(Assert.java:71)
        at org.apache.hadoop.hdfs.TestDFSClientRetries.testNamenodeRestart(TestDFSClientRetries.java:886)
{noformat}
[jira] [Commented] (HDFS-3801) Provide a way to disable browsing of files from the web UI
[ https://issues.apache.org/jira/browse/HDFS-3801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434268#comment-13434268 ]

Harsh J commented on HDFS-3801:
-------------------------------

Thanks Steve. That should be possible to do, given that we mostly seem to call the JspHelper class methods.

Provide a way to disable browsing of files from the web UI
-----------------------------------------------------------
Key: HDFS-3801
URL: https://issues.apache.org/jira/browse/HDFS-3801
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Priority: Minor

A few times we've had requests from users who wish to disable browsing of the filesystem in the web UI completely, while keeping other servlet functionality enabled (such as fsck, etc.). Right now, the cheap way to do this is by blocking out the DN web port (50075) from access by clients, but that also hampers HFTP transfers. We should instead provide a toggle config for the JSPs to use and disallow browsing if the toggle is enabled. The config can default to true, so as not to change the current behavior.
[jira] [Commented] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434276#comment-13434276 ]

Eli Collins commented on HDFS-3788:
------------------------------------

bq. Do you mean fs -get? No way, they should have the same code path. Are you sure that both server and client were running trunk?

Yes, hadoop fs -get of a 3gb file works but distcp of the directory containing that file fails. And yes, I am using a trunk build for everything, just running this via a pseudo-distributed tarball install on my laptop.

Can you explain what the bug is and the relevant fix? I don't see why we were not setting the Content-Length header, as we do that unconditionally on the server side.

distcp can't copy large files using webhdfs due to missing Content-Length header
---------------------------------------------------------------------------------
Key: HDFS-3788
URL: https://issues.apache.org/jira/browse/HDFS-3788
Project: Hadoop HDFS
Issue Type: Bug
Components: webhdfs
Affects Versions: 0.23.3, 2.0.0-alpha
Reporter: Eli Collins
Assignee: Tsz Wo (Nicholas), SZE
Priority: Critical
Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814.patch

The following command fails when data1 contains a 3gb file. It passes when using hftp or when the directory just contains smaller (2gb) files, so it looks like a webhdfs issue with large files.

{{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 hdfs://localhost:8020/user/eli/data2}}
[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname
[ https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434277#comment-13434277 ]

Eli Collins commented on HDFS-3150:
------------------------------------

These new options are distinct from hadoop.security.token.service.use_ip; it would be reasonable to have them set to true and use_ip set to false, and vice versa. Note that these only affect client-to-DN and DN-to-DN connections, whereas use_ip affects client-to-NN connections. There's really not much overlap.

Add option for clients to contact DNs via hostname
---------------------------------------------------
Key: HDFS-3150
URL: https://issues.apache.org/jira/browse/HDFS-3150
Project: Hadoop HDFS
Issue Type: New Feature
Components: data-node, hdfs client
Affects Versions: 1.0.0, 2.0.0-alpha
Reporter: Eli Collins
Assignee: Eli Collins
Fix For: 1.1.0
Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt

The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} is the wildcard); however, per HADOOP-6867, only the source address (IP) of the registration is given to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup; this had the side effect of breaking DN multihoming (the client cannot route the IP exposed by the NN if the DN registers with an interface that has a cluster-private IP). To fix this, let's add back the option for Datanodes to be accessed by hostname. This can be done by:
# Modifying the primary field of the Datanode descriptor to be the hostname, or
# Modifying client/Datanode-to-Datanode access to use the hostname field instead of the IP

Approach #2 does not require an incompatible client protocol change and is much less invasive. It minimizes the scope of modification to just the places where clients and Datanodes connect, vs changing all uses of Datanode identifiers. New client and Datanode configuration options are introduced:
- {{dfs.client.use.datanode.hostname}} indicates all client-to-datanode connections should use the datanode hostname (as clients outside the cluster may not be able to route the IP)
- {{dfs.datanode.use.datanode.hostname}} indicates whether Datanodes should use hostnames when connecting to other Datanodes for data transfer

If the configuration options are not used, there is no change in the current behavior.
[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434284#comment-13434284 ]

Suresh Srinivas commented on HDFS-3731:
----------------------------------------

Colin, can you please add a description of the final approach you are taking to solve this problem?

2.0 release upgrade must handle blocks being written from 1.0
---------------------------------------------------------------
Key: HDFS-3731
URL: https://issues.apache.org/jira/browse/HDFS-3731
Project: Hadoop HDFS
Issue Type: Bug
Components: data-node
Affects Versions: 2.0.0-alpha
Reporter: Suresh Srinivas
Assignee: Colin Patrick McCabe
Priority: Blocker
Attachments: HDFS-3731.002.patch, HDFS-3731.003.patch

Release 2.0 upgrades must handle blocks being written to (bbw) files from the 1.0 release. Problem reported by Brahma Reddy.
[jira] [Commented] (HDFS-3772) HDFS NN will hang in safe mode and never come out if we change the dfs.namenode.replication.min bigger.
[ https://issues.apache.org/jira/browse/HDFS-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434293#comment-13434293 ]

Konstantin Shvachko commented on HDFS-3772:
--------------------------------------------

I was writing this when Jira went down. Will try to reproduce that comment.

bq. files created with the old replication count will be expected to bump up to the new minimum upon restart automatically

This is not an expected behavior. {{dfs.namenode.replication.min}} has two purposes:
# It counts the blocks satisfying the new minimum replication during startup.
# It controls the minimal number of replicas that must be created during the write pipeline in order to call the data transfer successful.

Setting {{replication.min}} to a higher value does not mean the NN replicates blocks to that minimum. It means the NN will wait for that many replicas to be reported during startup before exiting SafeMode. If you set it too high, this is one of the ways to never let the NN go out of SafeMode automatically. SafeMode prohibits replication or deletion of blocks and modification of the namespace, so block replication will not happen until the NN leaves SafeMode.

If you are trying to increase block replication for all files in your file system, you should use {{setReplication()}} on the root recursively. But replication will start only after SafeMode is OFF.

bq. I think we can change the semantics of this parameter to the percentage of blocks that satisfy the real replication of each file.

Not a good idea. In general, changing the semantics of existing parameters is confusing. And in particular, this would make the NN stay in SafeMode forever if some DataNodes don't come up. I think the question here is: what are you trying to achieve with this?

HDFS NN will hang in safe mode and never come out if we change the dfs.namenode.replication.min bigger.
--------------------------------------------------------------------------------------------------------
Key: HDFS-3772
URL: https://issues.apache.org/jira/browse/HDFS-3772
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 2.0.0-alpha
Reporter: Yanbo Liang

If the NN restarts with a new minimum replication (dfs.namenode.replication.min), any files created with the old replication count will be expected to bump up to the new minimum upon restart automatically. However, the real behavior is that if the NN restarts with a new minimum replication that is bigger than the old one, the NN will hang in safe mode and never come out. The corresponding test case passes only because we are missing some test coverage; this was discussed in HDFS-3734.

If the NN receives reports for enough blocks satisfying the new minimum replication, it exits safe mode. However, if we configure a bigger minimum replication, there will not be enough blocks satisfying that limit. Look at the code segment in FSNamesystem.java:

{code}
private synchronized void incrementSafeBlockCount(short replication) {
  if (replication == safeReplication) {
    this.blockSafe++;
    checkMode();
  }
}
{code}

The DNs report blocks to the NN, and if the replication equals safeReplication (which is assigned from the new minimum replication), blockSafe is incremented. But if we configure a bigger minimum replication, all the blocks whose replication is lower than it can never satisfy this equality, even though the NN has actually received complete block information. As a result, blockSafe does not increment as usual, never reaches the amount needed to exit safe mode, and the NN hangs.
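As an illustration of the approach Konstantin suggests (raising the replication of existing files with setReplication() rather than by bumping dfs.namenode.replication.min), a simple recursive walk; this is only a sketch, and the shell equivalent is roughly "hadoop fs -setrep -R <replication> <path>":

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RaiseReplication {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    setRepRecursively(fs, new Path(args[0]), Short.parseShort(args[1]));
  }

  static void setRepRecursively(FileSystem fs, Path dir, short rep)
      throws Exception {
    for (FileStatus st : fs.listStatus(dir)) {
      if (st.isDirectory()) {
        setRepRecursively(fs, st.getPath(), rep);
      } else {
        // Takes effect only after the NN has left safe mode.
        fs.setReplication(st.getPath(), rep);
      }
    }
  }
}
{code}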
[jira] [Resolved] (HDFS-3649) Port HDFS-385 to branch-1-win
[ https://issues.apache.org/jira/browse/HDFS-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sumadhur Reddy Bolli resolved HDFS-3649.
-----------------------------------------
Resolution: Fixed
Release Note: Nicholas submitted the patches posted on HDFS-385 to branch-1 and branch-1-win

Port HDFS-385 to branch-1-win
------------------------------
Key: HDFS-3649
URL: https://issues.apache.org/jira/browse/HDFS-3649
Project: Hadoop HDFS
Issue Type: Improvement
Affects Versions: 1-win
Reporter: Sumadhur Reddy Bolli
Assignee: Sumadhur Reddy Bolli

Added a patch to HDFS-385 to port the existing pluggable placement policy to branch-1-win.
[jira] [Commented] (HDFS-3765) Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages
[ https://issues.apache.org/jira/browse/HDFS-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434321#comment-13434321 ]

Aaron T. Myers commented on HDFS-3765:
---------------------------------------

+1, the latest patch looks good to me.

Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages
---------------------------------------------------------------------------------
Key: HDFS-3765
URL: https://issues.apache.org/jira/browse/HDFS-3765
Project: Hadoop HDFS
Issue Type: Improvement
Components: ha
Affects Versions: 2.1.0-alpha, 3.0.0
Reporter: Vinay
Assignee: Vinay
Attachments: HDFS-3765.patch, HDFS-3765.patch, HDFS-3765.patch, hdfs-3765.txt, hdfs-3765.txt, hdfs-3765.txt

Currently, NameNode INITIALIZESHAREDEDITS provides the ability to copy the edits files to file-scheme-based shared storage when moving a cluster from a non-HA environment to an HA-enabled environment. This Jira focuses on the following:
* Generalizing the logic of copying the edits to the new shared storage so that any scheme-based shared storage can be initialized for an HA cluster.
[jira] [Updated] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao updated HDFS-3723:
----------------------------
Attachment: HDFS-3723.003.patch

All commands should support meaningful --help
---------------------------------------------
Key: HDFS-3723
URL: https://issues.apache.org/jira/browse/HDFS-3723
Project: Hadoop HDFS
Issue Type: Improvement
Components: scripts, tools
Affects Versions: 2.0.0-alpha
Reporter: E. Sammer
Assignee: Jing Zhao
Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.002.patch, HDFS-3723.003.patch, HDFS-3723.patch, HDFS-3723.patch

Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking.

{code}
[esammer@hadoop-fed01 ~]# hdfs zkfc --help
Exception in thread "main" org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode.
        at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168)
{code}

This would go a long way toward better usability for ops staff.
[jira] [Commented] (HDFS-3795) QJM: validate journal dir at startup
[ https://issues.apache.org/jira/browse/HDFS-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434326#comment-13434326 ]

Aaron T. Myers commented on HDFS-3795:
---------------------------------------

+1, the updated patch looks good to me.

QJM: validate journal dir at startup
-------------------------------------
Key: HDFS-3795
URL: https://issues.apache.org/jira/browse/HDFS-3795
Project: Hadoop HDFS
Issue Type: Sub-task
Components: ha
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Minor
Attachments: hdfs-3795.txt, hdfs-3795.txt

Currently, the JN does not validate the configured journal directory until it tries to write into it. This is counter-intuitive for users, since they would expect to find out about a misconfiguration at startup time rather than on first access. Additionally, two testers accidentally configured the journal dir to be a URI, which the code silently understood as a relative path ({{CWD/file:/foo/bar}}). We should validate the config at startup to be an accessible absolute path.
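A sketch of the startup validation described in the issue (illustrative, not the committed patch): reject journal directories that are not absolute local paths, which also catches a URI like file:/foo/bar being silently treated as a relative path:

{code}
import java.io.File;

public class JournalDirCheck {
  static File validateJournalDir(String configured) {
    if (configured == null || configured.isEmpty()) {
      throw new IllegalArgumentException("Journal directory is not configured");
    }
    File dir = new File(configured);
    // Rejects relative paths, including a URI such as "file:/foo/bar" that
    // would otherwise resolve to CWD/file:/foo/bar.
    if (!dir.isAbsolute()) {
      throw new IllegalArgumentException(
          "Journal directory '" + configured + "' must be an absolute path");
    }
    return dir;
  }
}
{code}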
[jira] [Updated] (HDFS-3757) libhdfs: improve native stack traces
[ https://issues.apache.org/jira/browse/HDFS-3757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe updated HDFS-3757:
----------------------------------------
Status: Open (was: Patch Available)

libhdfs: improve native stack traces
-------------------------------------
Key: HDFS-3757
URL: https://issues.apache.org/jira/browse/HDFS-3757
Project: Hadoop HDFS
Issue Type: Improvement
Components: libhdfs
Affects Versions: 2.2.0-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
Attachments: HDFS-3757.001.patch

When libhdfs crashes, we often don't get very good stack traces. It would be nice to get a better stack trace for the thread that crashed.
[jira] [Assigned] (HDFS-2882) DN continues to start up, even if block pool fails to initialize
[ https://issues.apache.org/jira/browse/HDFS-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins reassigned HDFS-2882:
----------------------------------
Assignee: (was: Todd Lipcon)

DN continues to start up, even if block pool fails to initialize
------------------------------------------------------------------
Key: HDFS-2882
URL: https://issues.apache.org/jira/browse/HDFS-2882
Project: Hadoop HDFS
Issue Type: Bug
Components: data-node
Affects Versions: 0.24.0
Reporter: Todd Lipcon
Attachments: hdfs-2882.txt

I started a DN on a machine that was completely out of space on one of its drives. I saw the following:

{code}
2012-02-02 09:56:50,499 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-448349972-172.29.5.192-1323816762969 (storage id DS-507718931-172.29.5.194-11072-1297842002148) service to styx01.sf.cloudera.com/172.29.5.192:8021
java.io.IOException: Mkdirs failed to create /data/1/scratch/todd/styx-datadir/current/BP-448349972-172.29.5.192-1323816762969/tmp
        at org.apache.hadoop.hdfs.server.datanode.FSDataset$BlockPoolSlice.init(FSDataset.java:335)
{code}

but the DN continued to run, spewing NPEs when it tried to do block reports, etc. This was on the HDFS-1623 branch but may affect trunk as well.
[jira] [Commented] (HDFS-3771) Namenode can't restart due to corrupt edit logs, timing issue with shutdown and edit log rolling
[ https://issues.apache.org/jira/browse/HDFS-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434392#comment-13434392 ] patrick white commented on HDFS-3771: - Thanks very much Todd, appreciate the feedback and references, and the suggestion on using exit to try to reproduce this. Namenode can't restart due to corrupt edit logs, timing issue with shutdown and edit log rolling Key: HDFS-3771 URL: https://issues.apache.org/jira/browse/HDFS-3771 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.3, 2.0.0-alpha Environment: QE, 20 node Federated cluster with 3 NNs and 15 DNs, using Kerberos based security Reporter: patrick white Priority: Critical Our 0.23.3 nightly HDFS regression suite encountered a particularly nasty issue recently, which resulted in the cluster's default Namenode being unable to restart. This was on a 20-node Federated cluster with security. The cause appears to be that the NN was just starting to roll its edit log when a shutdown occurred; the shutdown was intentional, to restart the cluster as part of an automated test. The tests that were running do not appear to be the issue in themselves; the cluster was just wrapping up an adminReport subset, this failure case has not reproduced so far, and it was not failing previously. It looks like a chance occurrence of sending the shutdown just as the edit log roll was begun. From the NN log, the following sequence is noted: 1. an InvalidateBlocks operation had completed 2. FSNamesystem: Roll Edit Log from [Secondary Namenode IPaddr] 3. FSEditLog: Ending log segment 23963 4. FSEditLog: Starting log segment at 23967 5. NameNode: SHUTDOWN_MSG = the NN shuts down and then is restarted... 6. FSImageTransactionalStorageInspector: Logs beginning at txid 23967 were are all in-progress 7. FSImageTransactionalStorageInspector: Marking log at /grid/[PATH]/edits_inprogress_0023967 as corrupt since it has no transactions in it. 8. NameNode: Exception in namenode join [main]java.lang.IllegalStateException: No non-corrupt logs for txid 23967 = NN start attempts continue to cycle trying to restart but can't, failing on the same exception due to lack of non-corrupt edit logs If these observations are correct and the issue is from the shutdown happening as edit logs are rolling, does the NN have an equivalent to the conventional fs 'sync' blocking action that should be called, or is there perhaps a timing hole? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3792) Fix two findbugs introduced by HDFS-3695
[ https://issues.apache.org/jira/browse/HDFS-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434412#comment-13434412 ] Hudson commented on HDFS-3792: -- Integrated in Hadoop-Common-trunk-Commit #2576 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2576/]) HDFS-3792. Fix two findbugs introduced by HDFS-3695. Contributed by Todd Lipcon. (Revision 1372690) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372690 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java Fix two findbugs introduced by HDFS-3695 Key: HDFS-3792 URL: https://issues.apache.org/jira/browse/HDFS-3792 Project: Hadoop HDFS Issue Type: Bug Components: build, name-node Affects Versions: 3.0.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Trivial Fix For: 3.0.0 Attachments: hdfs-3792.txt Accidentally introduced two trivial findbugs warnings in HDFS-3695. This JIRA is to fix them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3790) test_fuse_dfs.c doesn't compile on centos 5
[ https://issues.apache.org/jira/browse/HDFS-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434411#comment-13434411 ] Hudson commented on HDFS-3790: -- Integrated in Hadoop-Common-trunk-Commit #2576 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2576/]) HDFS-3790. test_fuse_dfs.c doesn't compile on centos 5. Contributed by Colin Patrick McCabe. (Revision 1372676) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372676 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/test/test_fuse_dfs.c test_fuse_dfs.c doesn't compile on centos 5 --- Key: HDFS-3790 URL: https://issues.apache.org/jira/browse/HDFS-3790 Project: Hadoop HDFS Issue Type: Bug Components: fuse-dfs Affects Versions: 2.2.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.2.0-alpha Attachments: HDFS-3790.001.patch test_fuse_dfs.c uses execvpe, which doesn't exist in the version of glibc shipped on CentOS 5. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3658) TestDFSClientRetries#testNamenodeRestart failed
[ https://issues.apache.org/jira/browse/HDFS-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434414#comment-13434414 ] Hudson commented on HDFS-3658: -- Integrated in Hadoop-Common-trunk-Commit #2576 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2576/]) HDFS-3658. Fix bugs in TestDFSClientRetries and add more tests. (Revision 1372707) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372707 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java TestDFSClientRetries#testNamenodeRestart failed --- Key: HDFS-3658 URL: https://issues.apache.org/jira/browse/HDFS-3658 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Eli Collins Assignee: Tsz Wo (Nicholas), SZE Fix For: 1.2.0, 2.2.0-alpha Attachments: h3658_20120808_b-1.patch, h3658_20120808.patch, test-log.txt Saw the following fail on a jenkins run: {noformat} Error Message expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51 Stacktrace junit.framework.AssertionFailedError: expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51 at junit.framework.Assert.fail(Assert.java:47) at junit.framework.Assert.failNotEquals(Assert.java:283) at junit.framework.Assert.assertEquals(Assert.java:64) at junit.framework.Assert.assertEquals(Assert.java:71) at org.apache.hadoop.hdfs.TestDFSClientRetries.testNamenodeRestart(TestDFSClientRetries.java:886) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3695) Genericize format() to non-file JournalManagers
[ https://issues.apache.org/jira/browse/HDFS-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434413#comment-13434413 ] Hudson commented on HDFS-3695: -- Integrated in Hadoop-Common-trunk-Commit #2576 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2576/]) HDFS-3792. Fix two findbugs introduced by HDFS-3695. Contributed by Todd Lipcon. (Revision 1372690) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372690 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java Genericize format() to non-file JournalManagers --- Key: HDFS-3695 URL: https://issues.apache.org/jira/browse/HDFS-3695 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 3.0.0 Attachments: hdfs-3695.txt, hdfs-3695.txt, hdfs-3695.txt Currently, the namenode -format and namenode -initializeSharedEdits commands do not understand how to do anything with non-file-based shared storage. This affects both BookKeeperJournalManager and QuorumJournalManager. This JIRA is to plumb through the formatting of edits directories using pluggable journal manager implementations so that no separate step needs to be taken to format them -- the same commands will work for NFS-based storage or one of the alternate implementations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
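The gist of the HDFS-3695 change, as a hedged sketch (the interface name below is made up and the real format() call also carries namespace metadata; this only illustrates the shape of the plumbing):
{code}
import java.io.IOException;

/**
 * Simplified view of the pluggable journal manager contract after this
 * change: formatting is part of the interface itself, so "namenode -format"
 * and "-initializeSharedEdits" can drive file-based, BookKeeper-based, or
 * quorum-based edits storage through the same code path.
 */
interface FormattableJournal {
  /** Wipe and re-initialize whatever storage backs this journal. */
  void format() throws IOException;

  /** True if the storage already holds data that a format would destroy. */
  boolean hasSomeData() throws IOException;
}
{code}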
[jira] [Commented] (HDFS-3790) test_fuse_dfs.c doesn't compile on centos 5
[ https://issues.apache.org/jira/browse/HDFS-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434416#comment-13434416 ] Hudson commented on HDFS-3790: -- Integrated in Hadoop-Hdfs-trunk-Commit #2641 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2641/]) HDFS-3790. test_fuse_dfs.c doesn't compile on centos 5. Contributed by Colin Patrick McCabe. (Revision 1372676) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372676 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/test/test_fuse_dfs.c test_fuse_dfs.c doesn't compile on centos 5 --- Key: HDFS-3790 URL: https://issues.apache.org/jira/browse/HDFS-3790 Project: Hadoop HDFS Issue Type: Bug Components: fuse-dfs Affects Versions: 2.2.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.2.0-alpha Attachments: HDFS-3790.001.patch test_fuse_dfs.c uses execvpe, which doesn't exist in the version of glibc shipped on CentOS 5. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3695) Genericize format() to non-file JournalManagers
[ https://issues.apache.org/jira/browse/HDFS-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434418#comment-13434418 ] Hudson commented on HDFS-3695: -- Integrated in Hadoop-Hdfs-trunk-Commit #2641 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2641/]) HDFS-3792. Fix two findbugs introduced by HDFS-3695. Contributed by Todd Lipcon. (Revision 1372690) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372690 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java Genericize format() to non-file JournalManagers --- Key: HDFS-3695 URL: https://issues.apache.org/jira/browse/HDFS-3695 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 3.0.0 Attachments: hdfs-3695.txt, hdfs-3695.txt, hdfs-3695.txt Currently, the namenode -format and namenode -initializeSharedEdits commands do not understand how to do anything with non-file-based shared storage. This affects both BookKeeperJournalManager and QuorumJournalManager. This JIRA is to plumb through the formatting of edits directories using pluggable journal manager implementations so that no separate step needs to be taken to format them -- the same commands will work for NFS-based storage or one of the alternate implementations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3792) Fix two findbugs introduced by HDFS-3695
[ https://issues.apache.org/jira/browse/HDFS-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434417#comment-13434417 ] Hudson commented on HDFS-3792: -- Integrated in Hadoop-Hdfs-trunk-Commit #2641 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2641/]) HDFS-3792. Fix two findbugs introduced by HDFS-3695. Contributed by Todd Lipcon. (Revision 1372690) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372690 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java Fix two findbugs introduced by HDFS-3695 Key: HDFS-3792 URL: https://issues.apache.org/jira/browse/HDFS-3792 Project: Hadoop HDFS Issue Type: Bug Components: build, name-node Affects Versions: 3.0.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Trivial Fix For: 3.0.0 Attachments: hdfs-3792.txt Accidentally introduced two trivial findbugs warnings in HDFS-3695. This JIRA is to fix them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3658) TestDFSClientRetries#testNamenodeRestart failed
[ https://issues.apache.org/jira/browse/HDFS-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434419#comment-13434419 ] Hudson commented on HDFS-3658: -- Integrated in Hadoop-Hdfs-trunk-Commit #2641 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2641/]) HDFS-3658. Fix bugs in TestDFSClientRetries and add more tests. (Revision 1372707) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372707 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java TestDFSClientRetries#testNamenodeRestart failed --- Key: HDFS-3658 URL: https://issues.apache.org/jira/browse/HDFS-3658 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Eli Collins Assignee: Tsz Wo (Nicholas), SZE Fix For: 1.2.0, 2.2.0-alpha Attachments: h3658_20120808_b-1.patch, h3658_20120808.patch, test-log.txt Saw the following fail on a jenkins run: {noformat} Error Message expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51 Stacktrace junit.framework.AssertionFailedError: expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51 at junit.framework.Assert.fail(Assert.java:47) at junit.framework.Assert.failNotEquals(Assert.java:283) at junit.framework.Assert.assertEquals(Assert.java:64) at junit.framework.Assert.assertEquals(Assert.java:71) at org.apache.hadoop.hdfs.TestDFSClientRetries.testNamenodeRestart(TestDFSClientRetries.java:886) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3649) Port HDFS-385 to branch-1-win
[ https://issues.apache.org/jira/browse/HDFS-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sumadhur Reddy Bolli updated HDFS-3649: --- Release Note: block placement policy is now ported to branch-1 and branch-1-win (was: Nicholas submitted the patches posted on HDFS-385 to branch-1 and branch-1-win) Nicholas committed the patches posted on HDFS-385 to branch-1 and branch-1-win Port HDFS-385 to branch-1-win - Key: HDFS-3649 URL: https://issues.apache.org/jira/browse/HDFS-3649 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 1-win Reporter: Sumadhur Reddy Bolli Assignee: Sumadhur Reddy Bolli Added patch to HDFS-385 to port the existing pluggable placement policy to branch-1-win -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3731: --- Description: Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. The {{DataNode}} will only have one block pool after upgrading from a 1.x release. (This is because in the 1.x releases, there were no block pools-- or equivalently, everything was in the same block pool). During the upgrade, we should hardlink the block files from the {{blocksBeingWritten}} directory into the {{rbw}} directory of this block pool. Similarly, on {{-finalize}}, we should delete the {{blocksBeingWritten}} directory. was: Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. 2.0 release upgrade must handle blocks being written from 1.0 - Key: HDFS-3731 URL: https://issues.apache.org/jira/browse/HDFS-3731 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0-alpha Reporter: Suresh Srinivas Assignee: Colin Patrick McCabe Priority: Blocker Attachments: HDFS-3731.002.patch, HDFS-3731.003.patch Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. The {{DataNode}} will only have one block pool after upgrading from a 1.x release. (This is because in the 1.x releases, there were no block pools-- or equivalently, everything was in the same block pool). During the upgrade, we should hardlink the block files from the {{blocksBeingWritten}} directory into the {{rbw}} directory of this block pool. Similarly, on {{-finalize}}, we should delete the {{blocksBeingWritten}} directory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
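A rough sketch of the hardlinking step described above, assuming a flat blocksBeingWritten directory and using java.nio hard links (the actual upgrade code has to handle the real directory layout, generation stamps, and rollback snapshots):
{code}
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

class BbwUpgradeSketch {
  /**
   * Hard-link every file from the 1.x blocksBeingWritten directory into the
   * single block pool's rbw directory. Linking (rather than moving) leaves
   * the originals in place so a rollback can still find them; -finalize can
   * later delete the blocksBeingWritten directory.
   */
  static void linkBbwIntoRbw(Path blocksBeingWritten, Path rbw) throws IOException {
    Files.createDirectories(rbw);
    try (DirectoryStream<Path> blocks = Files.newDirectoryStream(blocksBeingWritten)) {
      for (Path block : blocks) {
        Path target = rbw.resolve(block.getFileName());
        if (!Files.exists(target)) {
          Files.createLink(target, block);   // hard link, not a copy
        }
      }
    }
  }
}
{code}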
[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434432#comment-13434432 ] Colin Patrick McCabe commented on HDFS-3731: I added a description of the approach to the Description field. 2.0 release upgrade must handle blocks being written from 1.0 - Key: HDFS-3731 URL: https://issues.apache.org/jira/browse/HDFS-3731 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0-alpha Reporter: Suresh Srinivas Assignee: Colin Patrick McCabe Priority: Blocker Attachments: HDFS-3731.002.patch, HDFS-3731.003.patch Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. The {{DataNode}} will only have one block pool after upgrading from a 1.x release. (This is because in the 1.x releases, there were no block pools-- or equivalently, everything was in the same block pool). During the upgrade, we should hardlink the block files from the {{blocksBeingWritten}} directory into the {{rbw}} directory of this block pool. Similarly, on {{-finalize}}, we should delete the {{blocksBeingWritten}} directory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3790) test_fuse_dfs.c doesn't compile on centos 5
[ https://issues.apache.org/jira/browse/HDFS-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434450#comment-13434450 ] Hudson commented on HDFS-3790: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2599 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2599/]) HDFS-3790. test_fuse_dfs.c doesn't compile on centos 5. Contributed by Colin Patrick McCabe. (Revision 1372676) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372676 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/fuse-dfs/test/test_fuse_dfs.c test_fuse_dfs.c doesn't compile on centos 5 --- Key: HDFS-3790 URL: https://issues.apache.org/jira/browse/HDFS-3790 Project: Hadoop HDFS Issue Type: Bug Components: fuse-dfs Affects Versions: 2.2.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.2.0-alpha Attachments: HDFS-3790.001.patch test_fuse_dfs.c uses execvpe, which doesn't exist in the version of glibc shipped on CentOS 5. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3792) Fix two findbugs introduced by HDFS-3695
[ https://issues.apache.org/jira/browse/HDFS-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434451#comment-13434451 ] Hudson commented on HDFS-3792: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2599 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2599/]) HDFS-3792. Fix two findbugs introduced by HDFS-3695. Contributed by Todd Lipcon. (Revision 1372690) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372690 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java Fix two findbugs introduced by HDFS-3695 Key: HDFS-3792 URL: https://issues.apache.org/jira/browse/HDFS-3792 Project: Hadoop HDFS Issue Type: Bug Components: build, name-node Affects Versions: 3.0.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Trivial Fix For: 3.0.0 Attachments: hdfs-3792.txt Accidentally introduced two trivial findbugs warnings in HDFS-3695. This JIRA is to fix them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3658) TestDFSClientRetries#testNamenodeRestart failed
[ https://issues.apache.org/jira/browse/HDFS-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434453#comment-13434453 ] Hudson commented on HDFS-3658: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2599 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2599/]) HDFS-3658. Fix bugs in TestDFSClientRetries and add more tests. (Revision 1372707) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1372707 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java TestDFSClientRetries#testNamenodeRestart failed --- Key: HDFS-3658 URL: https://issues.apache.org/jira/browse/HDFS-3658 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Eli Collins Assignee: Tsz Wo (Nicholas), SZE Fix For: 1.2.0, 2.2.0-alpha Attachments: h3658_20120808_b-1.patch, h3658_20120808.patch, test-log.txt Saw the following fail on a jenkins run: {noformat} Error Message expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51 Stacktrace junit.framework.AssertionFailedError: expected:MD5-of-0MD5-of-512CRC32:f397fb3d9133d0a8f55854ea2bb268b0 but was:MD5-of-0MD5-of-0CRC32:70bc8f4b72a86921468bf8e8441dce51 at junit.framework.Assert.fail(Assert.java:47) at junit.framework.Assert.failNotEquals(Assert.java:283) at junit.framework.Assert.assertEquals(Assert.java:64) at junit.framework.Assert.assertEquals(Assert.java:71) at org.apache.hadoop.hdfs.TestDFSClientRetries.testNamenodeRestart(TestDFSClientRetries.java:886) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3765) Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages
[ https://issues.apache.org/jira/browse/HDFS-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-3765: -- Attachment: hdfs-3765-branch-2.txt Attaching branch-2 patch. It's the same except for some resolved conflicts on the imports. Will commit both patches momentarily. Thanks, Vinay. Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages --- Key: HDFS-3765 URL: https://issues.apache.org/jira/browse/HDFS-3765 Project: Hadoop HDFS Issue Type: Improvement Components: ha Affects Versions: 2.1.0-alpha, 3.0.0 Reporter: Vinay Assignee: Vinay Attachments: hdfs-3765-branch-2.txt, HDFS-3765.patch, HDFS-3765.patch, HDFS-3765.patch, hdfs-3765.txt, hdfs-3765.txt, hdfs-3765.txt Currently, NameNode INITIALIZESHAREDEDITS provides the ability to copy the edits files to file schema based shared storages when moving a cluster from a non-HA environment to an HA-enabled environment. This Jira focuses on the following * Generalizing the logic of copying the edits to new shared storage so that any schema based shared storage can be initialized for an HA cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434463#comment-13434463 ] Jason Lowe commented on HDFS-3788: -- I tested out the patch on trunk and am unable to reproduce Eli's issue. Without the patch both -get and distcp via webhdfs fail, but after the patch I can successfully -get and distcp large files. This is on a pseudo-distributed tarball without security, distcp is {{hadoop distcp webhdfs://localhost:50070/user/someuser/distcpsrc hdfs://localhost:8020/user/someuser/distcpdest}} where distcpsrc/ contains a 3GB file. distcp can't copy large files using webhdfs due to missing Content-Length header Key: HDFS-3788 URL: https://issues.apache.org/jira/browse/HDFS-3788 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 0.23.3, 2.0.0-alpha Reporter: Eli Collins Assignee: Tsz Wo (Nicholas), SZE Priority: Critical Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814.patch The following command fails when data1 contains a 3gb file. It passes when using hftp or when the directory just contains smaller (2gb) files, so looks like a webhdfs issue with large files. {{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 hdfs://localhost:8020/user/eli/data2}} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3796) Speed up edit log tests by avoiding fsync()
[ https://issues.apache.org/jira/browse/HDFS-3796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-3796: -- Attachment: hdfs-3796.txt Attached patch adds the same hook to the other test cases that Colin suggested. Speed up edit log tests by avoiding fsync() --- Key: HDFS-3796 URL: https://issues.apache.org/jira/browse/HDFS-3796 Project: Hadoop HDFS Issue Type: Improvement Components: test Affects Versions: 3.0.0, 2.2.0-alpha Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Attachments: hdfs-3796.txt, hdfs-3796.txt Our edit log tests are very slow because they incur a lot of fsyncs as they write out transactions. Since fsync() has no effect except in the case of power outages or system crashes, and we don't care about power outages in the context of tests, we can safely skip the fsync without any loss in coverage. In my tests, this sped up TestEditLog by about 5x. The testFuzzSequences test case improved from ~83 seconds with fsync to about 5 seconds without. These results are from my SSD laptop - they are probably even more drastic on spinning media. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
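A minimal illustration of the kind of hook this describes (the class and flag names below are made up, not the actual edit log output stream change): a test-only switch that turns the fsync into a no-op while leaving the normal write path untouched.
{code}
import java.io.FileDescriptor;
import java.io.IOException;

class SyncableOutputSketch {
  /**
   * Test-only switch: when true, flushAndSync() skips the fsync. Durability
   * only matters across power loss or OS crashes, which unit tests do not
   * model, so skipping it loses no coverage while avoiding the dominant cost.
   */
  static volatile boolean skipFsyncForTesting = false;

  private final FileDescriptor fd;

  SyncableOutputSketch(FileDescriptor fd) {
    this.fd = fd;
  }

  void flushAndSync() throws IOException {
    // Buffered data is assumed to have been written to the OS already;
    // only the expensive force-to-disk step is made optional.
    if (!skipFsyncForTesting) {
      fd.sync();
    }
  }
}
{code}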
[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434467#comment-13434467 ] Suresh Srinivas commented on HDFS-3731: --- Colin, is the recovery mechanism for bbw blocks in 1.x and rbw in 2.x compatible? 2.0 release upgrade must handle blocks being written from 1.0 - Key: HDFS-3731 URL: https://issues.apache.org/jira/browse/HDFS-3731 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0-alpha Reporter: Suresh Srinivas Assignee: Colin Patrick McCabe Priority: Blocker Attachments: HDFS-3731.002.patch, HDFS-3731.003.patch Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. The {{DataNode}} will only have one block pool after upgrading from a 1.x release. (This is because in the 1.x releases, there were no block pools-- or equivalently, everything was in the same block pool). During the upgrade, we should hardlink the block files from the {{blocksBeingWritten}} directory into the {{rbw}} directory of this block pool. Similarly, on {{-finalize}}, we should delete the {{blocksBeingWritten}} directory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3798) Avoid throwing NPE when finalizeSegment() is called on invalid segment
[ https://issues.apache.org/jira/browse/HDFS-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-3798: -- Attachment: hdfs-3798.txt addresses the test problem above. I'll commit this to the branch later today. Avoid throwing NPE when finalizeSegment() is called on invalid segment -- Key: HDFS-3798 URL: https://issues.apache.org/jira/browse/HDFS-3798 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Trivial Attachments: hdfs-3798.txt, hdfs-3798.txt Currently, if the client calls finalizeLogSegment() on a segment which doesn't exist on the JournalNode side, it throws an NPE. Instead it should throw a more intelligible exception. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3797) QJM: add segment txid as a parameter to journal() RPC
[ https://issues.apache.org/jira/browse/HDFS-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-3797: -- Attachment: hdfs-3797.txt Attached patch applies on top of HDFS-3798, HDFS-3799 QJM: add segment txid as a parameter to journal() RPC - Key: HDFS-3797 URL: https://issues.apache.org/jira/browse/HDFS-3797 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Attachments: hdfs-3797.txt During fault testing of QJM, I saw the following issue: 1) NN sends txn 5 to JN 2) NN gets partitioned from JN while JN remains up. The next two RPCs are missed while the partition has happened: 2a) finalizeSegment(1-5) 2b) startSegment(6) 3) NN sends txn 6 to JN This caused one of the JNs to end up with a segment 1-10 while the others had two segments; 1-5 and 6-10. This broke some invariants of the QJM protocol and prevented the recovery protocol from running properly. This can be addressed on the client side by HDFS-3726, which would cause the NN to not send the RPC in #3. But it makes sense to also add an extra safety check here on the server side: with every journal() call, we can send the segment's txid. Then if the JN and the client get out of sync, the JN can reject the RPCs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
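A sketch of the server-side safety check being proposed (the method shape is illustrative, not the actual JournalNode code): the writer passes the txid that began the current segment on every journal() call, and the JournalNode rejects calls that disagree with the segment it has open.
{code}
import java.io.IOException;

class JournalSegmentCheck {
  /** Txid that began the segment currently open on this JournalNode, or -1. */
  private long curSegmentTxId = -1;

  void startLogSegment(long txid) {
    curSegmentTxId = txid;
  }

  /**
   * Called at the top of journal(): if the client believes it is writing a
   * different segment than the one open here (e.g. a finalize/start pair was
   * lost during a network partition), refuse the edits rather than silently
   * appending them to the wrong segment.
   */
  void checkSegment(long segmentTxId) throws IOException {
    if (segmentTxId != curSegmentTxId) {
      throw new IOException("Client is writing the segment starting at txid "
          + segmentTxId + " but this JournalNode has the segment starting at txid "
          + curSegmentTxId + " open; rejecting out-of-sync journal() call");
    }
  }
}
{code}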
[jira] [Commented] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434478#comment-13434478 ] Daryn Sharp commented on HDFS-3788: --- I believe it's legitimate to send a content-length (if known) with a chunked response. You may want to check for chunking only if there's not a content-length. It's up to you. I think a test case would be invaluable since the file size issue has reared itself a few times. Could you add a test that uses a mock? distcp can't copy large files using webhdfs due to missing Content-Length header Key: HDFS-3788 URL: https://issues.apache.org/jira/browse/HDFS-3788 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 0.23.3, 2.0.0-alpha Reporter: Eli Collins Assignee: Tsz Wo (Nicholas), SZE Priority: Critical Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814.patch The following command fails when data1 contains a 3gb file. It passes when using hftp or when the directory just contains smaller (2gb) files, so looks like a webhdfs issue with large files. {{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 hdfs://localhost:8020/user/eli/data2}} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
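As a rough illustration of the point above (a sketch against java.net.HttpURLConnection, not the actual WebHDFS client code): prefer an explicit Content-Length when the server sends one, and only treat the response as unbounded when the header is genuinely absent.
{code}
import java.io.IOException;
import java.net.HttpURLConnection;

class ResponseLengthSketch {
  /**
   * Returns the declared content length, or -1 when the response is chunked
   * and carries no Content-Length. Callers should only skip length checking
   * in the -1 case; a present header is authoritative even on a chunked reply.
   */
  static long declaredLength(HttpURLConnection conn) throws IOException {
    String contentLength = conn.getHeaderField("Content-Length");
    if (contentLength != null) {
      // Parse the header ourselves: getContentLength() returns an int and
      // overflows for files larger than 2GB.
      return Long.parseLong(contentLength);
    }
    if ("chunked".equalsIgnoreCase(conn.getHeaderField("Transfer-Encoding"))) {
      return -1;
    }
    throw new IOException(
        "Response has neither a Content-Length header nor chunked encoding");
  }
}
{code}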
[jira] [Created] (HDFS-3802) StartupOption.name in HdfsServerConstants should be final
Jing Zhao created HDFS-3802: --- Summary: StartupOption.name in HdfsServerConstants should be final Key: HDFS-3802 URL: https://issues.apache.org/jira/browse/HDFS-3802 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jing Zhao Priority: Trivial In HdfsServerConstants, it may be better to define StartupOption.name as final since it will not and should not be modified after initialization. For example, in NameNode.java, the printUsage function prints out multiple startup options' names. The modification/change of StartupOption.name may cause an invalid usage message. Although right now there are no methods to change/set the value of StartupOption.name, it is better to add the final keyword to make sure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
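For illustration, the change amounts to declaring the enum's name field final so it can only be assigned in the constructor (heavily trimmed; the real StartupOption enum has many more constants):
{code}
enum StartupOption {
  FORMAT("-format"),
  UPGRADE("-upgrade"),
  ROLLBACK("-rollback");

  // final: assigned exactly once in the constructor and never modified,
  // so usage messages built from getName() cannot be silently corrupted.
  private final String name;

  StartupOption(String name) {
    this.name = name;
  }

  String getName() {
    return name;
  }
}
{code}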
[jira] [Updated] (HDFS-3802) StartupOption.name in HdfsServerConstants should be final
[ https://issues.apache.org/jira/browse/HDFS-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-3802: Attachment: HDFS-3802.patch StartupOption.name in HdfsServerConstants should be final - Key: HDFS-3802 URL: https://issues.apache.org/jira/browse/HDFS-3802 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jing Zhao Priority: Trivial Attachments: HDFS-3802.patch In HdfsServerConstants, it may be better to define StartupOption.name as final since it will not and should not be modified after initialization. For example, in NameNode.java, the printUsage function prints out multiple startup options' names. The modification/change of StartupOption.name may cause an invalid usage message. Although right now there are no methods to change/set the value of StartupOption.name, it is better to add the final keyword to make sure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3802) StartupOption.name in HdfsServerConstants should be final
[ https://issues.apache.org/jira/browse/HDFS-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-3802: Affects Version/s: 3.0.0 Fix Version/s: 3.0.0 StartupOption.name in HdfsServerConstants should be final - Key: HDFS-3802 URL: https://issues.apache.org/jira/browse/HDFS-3802 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Jing Zhao Priority: Trivial Fix For: 3.0.0 Attachments: HDFS-3802.patch In HdfsServerConstants, it may be better to define StartupOption.name as final since it will not and should not be modified after initialization. For example, in NameNode.java, the printUsage function prints out multiple startup options' names. The modification/change of StartupOption.name may cause an invalid usage message. Although right now there are no methods to change/set the value of StartupOption.name, it is better to add the final keyword to make sure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3718) Datanode won't shutdown because of runaway DataBlockScanner thread
[ https://issues.apache.org/jira/browse/HDFS-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-3718: -- Fix Version/s: 2.1.0-alpha 0.23.3 I've committed to trunk, branches for 2.x, and 23. Thanks Kihwal! Datanode won't shutdown because of runaway DataBlockScanner thread -- Key: HDFS-3718 URL: https://issues.apache.org/jira/browse/HDFS-3718 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.1-alpha Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha Attachments: hdfs-3718.patch.txt Datanode sometimes does not shutdown because the block pool scanner thread keeps running. It prints out Starting a new period every five seconds, even after {{shutdown()}} is called. Somehow the interrupt is missed. {{DataBlockScanner}} will also terminate if {{datanode.shouldRun}} is false, but in {{DataNode#shutdown}}, {{DataBlockScanner#shutdown()}} is invoked before it is being set to false. Is there any reason why {{datanode.shouldRun}} is set to false later? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
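A simplified picture of the ordering problem (illustrative code, not the DataNode itself): if the scanner loop keys off a shouldRun flag, that flag must be cleared before the scanner is told to stop, so that a swallowed interrupt still lets the loop exit.
{code}
class ScannerShutdownSketch {
  private volatile boolean shouldRun = true;
  private Thread scannerThread;

  void start() {
    scannerThread = new Thread(() -> {
      while (shouldRun) {
        try {
          Thread.sleep(5000);           // stand-in for one scan period
        } catch (InterruptedException ie) {
          // The interrupt can be swallowed here (as in the bug); the loop
          // still terminates because shutdown() clears shouldRun first.
        }
      }
    }, "BlockScannerSketch");
    scannerThread.start();
  }

  void shutdown() throws InterruptedException {
    shouldRun = false;        // clear the flag *before* interrupting
    scannerThread.interrupt();
    scannerThread.join();
  }
}
{code}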
[jira] [Updated] (HDFS-3718) Datanode won't shutdown because of runaway DataBlockScanner thread
[ https://issues.apache.org/jira/browse/HDFS-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-3718: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Datanode won't shutdown because of runaway DataBlockScanner thread -- Key: HDFS-3718 URL: https://issues.apache.org/jira/browse/HDFS-3718 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.1-alpha Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha Attachments: hdfs-3718.patch.txt Datanode sometimes does not shutdown because the block pool scanner thread keeps running. It prints out Starting a new period every five seconds, even after {{shutdown()}} is called. Somehow the interrupt is missed. {{DataBlockScanner}} will also terminate if {{datanode.shouldRun}} is false, but in {{DataNode#shutdown}}, {{DataBlockScanner#shutdown()}} is invoked before it is being set to false. Is there any reason why {{datanode.shouldRun}} is set to false later? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3793) Implement genericized format() in QJM
[ https://issues.apache.org/jira/browse/HDFS-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434546#comment-13434546 ] Andrew Purtell commented on HDFS-3793: -- +1 Applied this patch after the generic support and confirmed with manual testing. Implement genericized format() in QJM - Key: HDFS-3793 URL: https://issues.apache.org/jira/browse/HDFS-3793 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-3793.txt HDFS-3695 added the ability for non-File journal managers to tie into calls like NameNode -format. This JIRA is to implement format() for QuorumJournalManager. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434549#comment-13434549 ] Hadoop QA commented on HDFS-3723: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12540903/HDFS-3723.003.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.fs.TestS3_LocalFileContextURI org.apache.hadoop.fs.s3native.TestInMemoryNativeS3FileSystemContract org.apache.hadoop.fs.TestLocal_S3FileContextURI org.apache.hadoop.fs.s3.TestInMemoryS3FileSystemContract org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3003//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3003//console This message is automatically generated. All commands should support meaningful --help - Key: HDFS-3723 URL: https://issues.apache.org/jira/browse/HDFS-3723 Project: Hadoop HDFS Issue Type: Improvement Components: scripts, tools Affects Versions: 2.0.0-alpha Reporter: E. Sammer Assignee: Jing Zhao Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.002.patch, HDFS-3723.003.patch, HDFS-3723.patch, HDFS-3723.patch Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking. {code} [esammer@hadoop-fed01 ~]# hdfs zkfc --help Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode. at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168) {code} This would go a long way toward better usability for ops staff. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3793) Implement genericized format() in QJM
[ https://issues.apache.org/jira/browse/HDFS-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434551#comment-13434551 ] Aaron T. Myers commented on HDFS-3793: -- Patch looks pretty good to me. Two small comments: # Seems like you should make the QuorumJournalManager#format and QuorumJournalManager#hasSomeData timeouts configurable, or at least use constants and add a comment or two justifying how you chose those values. # I think I see the reasoning behind the need for the call to unlockAll in JNStorage#format, but you might want to add a comment explaining why it's necessary. Also, if this happens, when will the storage be locked again? Might want to add a comment explaining that as well. +1 once these are addressed. Implement genericized format() in QJM - Key: HDFS-3793 URL: https://issues.apache.org/jira/browse/HDFS-3793 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-3793.txt HDFS-3695 added the ability for non-File journal managers to tie into calls like NameNode -format. This JIRA is to implement format() for QuorumJournalManager. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname
[ https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434557#comment-13434557 ] Hudson commented on HDFS-3150: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2601 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2601/]) HDFS-3150. Add option for clients to contact DNs via hostname. Contributed by Eli Collins (Revision 1373094) Result = FAILURE eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1373094 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeID.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DNConf.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileChecksumServlets.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestMiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestShortCircuitLocalRead.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java Add option for clients to contact DNs via hostname -- Key: HDFS-3150 URL: https://issues.apache.org/jira/browse/HDFS-3150 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, hdfs client 
Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Fix For: 1.1.0 Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} is the wildcard) however per HADOOP-6867 only the source address (IP) of the registration is given to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup, this had the side effect of breaking DN multihoming (the client can not route the IP exposed by the NN if the DN registers with an interface that has a cluster-private IP). To fix this let's add back the option for Datanodes to be accessed by hostname. This can be done by: # Modifying the primary field of the Datanode descriptor to be the hostname, or # Modifying Client/Datanode - Datanode access use the hostname field instead of the IP Approach #2 does not require an incompatible client protocol change, and is much less invasive. It minimizes the scope of modification to just places where clients and Datanodes connect, vs changing all uses of Datanode
[jira] [Commented] (HDFS-3731) 2.0 release upgrade must handle blocks being written from 1.0
[ https://issues.apache.org/jira/browse/HDFS-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434576#comment-13434576 ] Colin Patrick McCabe commented on HDFS-3731: The design doc for HDFS-265 says: bq. RWR (Replica Waiting to be Recovered): If a DataNode dies and restarts, all its rbw replicas change to be in the rwr state. Rwr replicas will not be in any pipeline and therefore will not receive any new bytes. They will either become out of date or will participate in a lease recovery if the client also dies. It seems to me that by putting the blocks into the rbw directory, what will happen when the 2.x DataNode is started is that the blocks will participate in lease recovery after a few minutes have gone past. There is a unit test in this patch which short-cuts this process by manually invoking lease recovery on the files and then verifying that they can be read. Is there any more documentation about the lease recovery process? As far as I can tell, it seems to work fine on the files in this patch. It might be useful to test waiting for automatic lease recovery to be triggered rather than invoking it manually. 2.0 release upgrade must handle blocks being written from 1.0 - Key: HDFS-3731 URL: https://issues.apache.org/jira/browse/HDFS-3731 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.0-alpha Reporter: Suresh Srinivas Assignee: Colin Patrick McCabe Priority: Blocker Attachments: HDFS-3731.002.patch, HDFS-3731.003.patch Release 2.0 upgrades must handle blocks being written to (bbw) files from 1.0 release. Problem reported by Brahma Reddy. The {{DataNode}} will only have one block pool after upgrading from a 1.x release. (This is because in the 1.x releases, there were no block pools-- or equivalently, everything was in the same block pool). During the upgrade, we should hardlink the block files from the {{blocksBeingWritten}} directory into the {{rbw}} directory of this block pool. Similarly, on {{-finalize}}, we should delete the {{blocksBeingWritten}} directory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3150) Add option for clients to contact DNs via hostname
[ https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3150: -- Resolution: Fixed Fix Version/s: 2.2.0-alpha Target Version/s: (was: 2.2.0-alpha) Status: Resolved (was: Patch Available) I've committed this and merged to branch-2. Add option for clients to contact DNs via hostname -- Key: HDFS-3150 URL: https://issues.apache.org/jira/browse/HDFS-3150 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, hdfs client Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Fix For: 1.1.0, 2.2.0-alpha Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} is the wildcard) however per HADOOP-6867 only the source address (IP) of the registration is given to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup, this had the side effect of breaking DN multihoming (the client can not route the IP exposed by the NN if the DN registers with an interface that has a cluster-private IP). To fix this let's add back the option for Datanodes to be accessed by hostname. This can be done by: # Modifying the primary field of the Datanode descriptor to be the hostname, or # Modifying Client/Datanode - Datanode access use the hostname field instead of the IP Approach #2 does not require an incompatible client protocol change, and is much less invasive. It minimizes the scope of modification to just places where clients and Datanodes connect, vs changing all uses of Datanode identifiers. New client and Datanode configuration options are introduced: - {{dfs.client.use.datanode.hostname}} indicates all client to datanode connections should use the datanode hostname (as clients outside cluster may not be able to route the IP) - {{dfs.datanode.use.datanode.hostname}} indicates whether Datanodes should use hostnames when connecting to other Datanodes for data transfer If the configuration options are not used, there is no change in the current behavior. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
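Since the new keys are plain boolean configuration options, a client outside the cluster can opt in either through hdfs-site.xml or programmatically. A minimal sketch, assuming hadoop-common/hadoop-hdfs on the classpath and a default filesystem configured in core-site.xml; with both keys left at their default of false the old IP-based behavior is unchanged, as the description notes.
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HostnameClientExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    // Client side: connect to DataNodes by hostname instead of the NN-reported IP,
    // e.g. when the client cannot route the cluster-private addresses.
    conf.setBoolean("dfs.client.use.datanode.hostname", true);
    // Server side (set in each DataNode's configuration, shown here only for reference):
    // conf.setBoolean("dfs.datanode.use.datanode.hostname", true);
    try (FileSystem fs = FileSystem.get(conf)) {
      fs.getFileStatus(new Path("/"));
    }
  }
}
{code}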
[jira] [Created] (HDFS-3803) BlockPoolSliceScanner new work period notice is very chatty at INFO level
Andrew Purtell created HDFS-3803: Summary: BlockPoolSliceScanner new work period notice is very chatty at INFO level Key: HDFS-3803 URL: https://issues.apache.org/jira/browse/HDFS-3803 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.1.0-alpha, 2.0.1-alpha Environment: Hadoop 2.0.1-alpha-SNAPSHOT Reporter: Andrew Purtell Priority: Minor One line of ~140 chars logged every 5 seconds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434593#comment-13434593 ] Hadoop QA commented on HDFS-3788: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12540898/h3788_20120814.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.web.TestWebHDFS org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes org.apache.hadoop.hdfs.TestHftpFileSystem org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks org.apache.hadoop.hdfs.TestByteRangeInputStream org.apache.hadoop.hdfs.TestDFSShell org.apache.hadoop.hdfs.web.TestFSMainOperationsWebHdfs org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3005//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3005//console This message is automatically generated. distcp can't copy large files using webhdfs due to missing Content-Length header Key: HDFS-3788 URL: https://issues.apache.org/jira/browse/HDFS-3788 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 0.23.3, 2.0.0-alpha Reporter: Eli Collins Assignee: Tsz Wo (Nicholas), SZE Priority: Critical Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814.patch The following command fails when data1 contains a 3gb file. It passes when using hftp or when the directory just contains smaller (2gb) files, so looks like a webhdfs issue with large files. {{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 hdfs://localhost:8020/user/eli/data2}} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3803) BlockPoolSliceScanner new work period notice is very chatty at INFO level
[ https://issues.apache.org/jira/browse/HDFS-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HDFS-3803: - Attachment: HDFS-3803.patch Trivial patch applies to both trunk and branch-2. BlockPoolSliceScanner new work period notice is very chatty at INFO level - Key: HDFS-3803 URL: https://issues.apache.org/jira/browse/HDFS-3803 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.1.0-alpha, 2.0.1-alpha Environment: Hadoop 2.0.1-alpha-SNAPSHOT Reporter: Andrew Purtell Priority: Minor Attachments: HDFS-3803.patch One line of ~140 chars logged every 5 seconds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
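The attached patch is not reproduced here, but the general technique for this kind of fix is to demote the once-per-period status line to DEBUG (or guard it), so it no longer appears at the default INFO level. A hedged sketch of that idea; the class, method, and message text below are illustrative and are not the actual BlockPoolSliceScanner code or the contents of HDFS-3803.patch.
{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/** Illustrative only -- not the actual HDFS-3803 patch. */
public class QuietScannerLogSketch {
  private static final Log LOG = LogFactory.getLog(QuietScannerLogSketch.class);

  void startNewPeriod(double workLeftPct) {
    // Log the per-period status line at DEBUG so a healthy DataNode does not
    // emit it every few seconds at the default log level.
    if (LOG.isDebugEnabled()) {
      LOG.debug("Starting a new period, work left in previous period: "
          + String.format("%.2f%%", workLeftPct));
    }
  }
}
{code}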
[jira] [Commented] (HDFS-3799) QJM: handle empty log segments during recovery
[ https://issues.apache.org/jira/browse/HDFS-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434606#comment-13434606 ] Aaron T. Myers commented on HDFS-3799: -- Patch looks really good. The tests in particular are very solid. Two nits: # sp: Synchronziing # Recommend replacing the three testOutOfSyncAtBeginningOfSegmentX methods with a loop from 0-2. Feel free to punt if you think this is clearer. +1 once these are addressed. QJM: handle empty log segments during recovery -- Key: HDFS-3799 URL: https://issues.apache.org/jira/browse/HDFS-3799 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-3799.txt One of the cases not yet handled in the QJM branch is the one where either the writer or the journal node crashes after startLogSegment() but before it has written its first transaction to the log. We currently have TODO assertions in the code which fire in these cases. This JIRA is to deal with these cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
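The second nit amounts to folding three near-identical test methods into a single loop over the index of the out-of-sync journal node. A sketch of the suggested shape; the method names are paraphrased from the review and the body is a placeholder, not the patch's actual test logic.
{code}
import org.junit.Test;

public class OutOfSyncRecoverySketch {
  @Test
  public void testOutOfSyncAtBeginningOfSegment() throws Exception {
    // Exercise the recovery path with journal node 0, 1, and 2 lagging in turn.
    for (int laggingNode = 0; laggingNode < 3; laggingNode++) {
      doTestOutOfSyncAtBeginningOfSegment(laggingNode);
    }
  }

  private void doTestOutOfSyncAtBeginningOfSegment(int laggingNode) throws Exception {
    // Placeholder: start a segment, drop the first transactions on the given
    // node, then assert that recovery still succeeds.
  }
}
{code}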
[jira] [Commented] (HDFS-3797) QJM: add segment txid as a parameter to journal() RPC
[ https://issues.apache.org/jira/browse/HDFS-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434626#comment-13434626 ] Aaron T. Myers commented on HDFS-3797: -- Patch looks pretty good to me. One question: have you considered adding a test case that ensures that a JN which experiences this scenario will return to participating in the quorum after the next finalize/new segment? Nit: looks like the method comment for testMissFinalizeAndNextStart got messed up a little bit: + **/ QJM: add segment txid as a parameter to journal() RPC - Key: HDFS-3797 URL: https://issues.apache.org/jira/browse/HDFS-3797 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Attachments: hdfs-3797.txt During fault testing of QJM, I saw the following issue: 1) NN sends txn 5 to JN 2) NN gets partitioned from JN while JN remains up. The next two RPCs are missed while the partition has happened: 2a) finalizeSegment(1-5) 2b) startSegment(6) 3) NN sends txn 6 to JN This caused one of the JNs to end up with a segment 1-10 while the others had two segments; 1-5 and 6-10. This broke some invariants of the QJM protocol and prevented the recovery protocol from running properly. This can be addressed on the client side by HDFS-3726, which would cause the NN to not send the RPC in #3. But it makes sense to also add an extra safety check here on the server side: with every journal() call, we can send the segment's txid. Then if the JN and the client get out of sync, the JN can reject the RPCs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
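The server-side safety check proposed above reduces to carrying the segment's starting txid on every journal() call and rejecting the write when it does not match the segment the JournalNode currently has open. A minimal sketch of that invariant, with simplified field and exception choices that are illustrative rather than the committed code:
{code}
import java.io.IOException;

public class JournalSegmentCheckSketch {
  /** Starting txid of the segment this node currently has open, or -1 if none. */
  private long curSegmentTxId = -1;

  public synchronized void startLogSegment(long txid) {
    curSegmentTxId = txid;
  }

  public synchronized void journal(long segmentTxId, long firstTxnId, byte[] records)
      throws IOException {
    if (segmentTxId != curSegmentTxId) {
      // The writer and this JournalNode have diverged (e.g. finalizeSegment and
      // startSegment were missed during a partition); refusing the write keeps
      // the segments consistent instead of fusing 1-5 and 6-10 into 1-10.
      throw new IOException("Writer is on segment starting at txid " + segmentTxId
          + " but this node's open segment starts at txid " + curSegmentTxId);
    }
    // ... append records starting at firstTxnId to the current segment ...
  }
}
{code}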
[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname
[ https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434632#comment-13434632 ] Hadoop QA commented on HDFS-3150: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12540836/hdfs-3150.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 10 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics org.apache.hadoop.hdfs.TestPersistBlocks +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3004//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3004//console This message is automatically generated. Add option for clients to contact DNs via hostname -- Key: HDFS-3150 URL: https://issues.apache.org/jira/browse/HDFS-3150 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, hdfs client Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Fix For: 1.1.0, 2.2.0-alpha Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} is the wildcard) however per HADOOP-6867 only the source address (IP) of the registration is given to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup, this had the side effect of breaking DN multihoming (the client can not route the IP exposed by the NN if the DN registers with an interface that has a cluster-private IP). To fix this let's add back the option for Datanodes to be accessed by hostname. This can be done by: # Modifying the primary field of the Datanode descriptor to be the hostname, or # Modifying Client/Datanode - Datanode access use the hostname field instead of the IP Approach #2 does not require an incompatible client protocol change, and is much less invasive. It minimizes the scope of modification to just places where clients and Datanodes connect, vs changing all uses of Datanode identifiers. New client and Datanode configuration options are introduced: - {{dfs.client.use.datanode.hostname}} indicates all client to datanode connections should use the datanode hostname (as clients outside cluster may not be able to route the IP) - {{dfs.datanode.use.datanode.hostname}} indicates whether Datanodes should use hostnames when connecting to other Datanodes for data transfer If the configuration options are not used, there is no change in the current behavior. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3800) QJM: improvements to QJM fault testing
[ https://issues.apache.org/jira/browse/HDFS-3800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434633#comment-13434633 ] Aaron T. Myers commented on HDFS-3800: -- Test looks great, and I agree we should go ahead and check it in to the branch as-is. One tiny nit: looks like you left in a System.out.println when we should probably have used a LOG.info. +1 otherwise. QJM: improvements to QJM fault testing -- Key: HDFS-3800 URL: https://issues.apache.org/jira/browse/HDFS-3800 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-3800.txt This JIRA improves TestQJMWithFaults as follows: - the current implementation didn't properly unwrap exceptions thrown by the reflection-based injection method. This caused some issues in the code where the injecting proxy didn't act quite like the original object. - the current implementation incorrectly assumed that the recovery process would recover to _exactly_ the last acked sequence number. In fact, it may recover to that transaction _or any greater transaction_. It also adds a new randomized test which uncovered a number of other bugs. I will defer to the included javadoc for a description of this test. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3788) distcp can't copy large files using webhdfs due to missing Content-Length header
[ https://issues.apache.org/jira/browse/HDFS-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434634#comment-13434634 ] Tsz Wo (Nicholas), SZE commented on HDFS-3788: -- bq. I believe it's legitimate to send a content-length (if known) with a chunked response. ... I believe it is not. See below from http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4 bq. The Content-Length header field MUST NOT be sent if these two lengths are different (i.e., if a Transfer-Encoding header field is present) bq. I think a test case would be invaluable since the file size issue has reared itself a few times. Could you add a test that uses a mock? I did have a mock test but it requires changing DatanodeWebHdfsMethods. I don't see an easy way to have mock tests without changing the main code. Do you have any idea? If yes, could you add the tests? distcp can't copy large files using webhdfs due to missing Content-Length header Key: HDFS-3788 URL: https://issues.apache.org/jira/browse/HDFS-3788 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 0.23.3, 2.0.0-alpha Reporter: Eli Collins Assignee: Tsz Wo (Nicholas), SZE Priority: Critical Attachments: distcp-webhdfs-errors.txt, h3788_20120813.patch, h3788_20120814.patch The following command fails when data1 contains a 3gb file. It passes when using hftp or when the directory just contains smaller (2gb) files, so looks like a webhdfs issue with large files. {{hadoop distcp webhdfs://eli-thinkpad:50070/user/eli/data1 hdfs://localhost:8020/user/eli/data2}} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
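A practical consequence of the rule quoted above is that a chunked response carries no Content-Length at all, so a client that wants to verify it received a complete file must read to EOF and compare against a length obtained elsewhere (for webhdfs, for example, from a file-status call). The following is a small generic sketch of that pattern with plain java.net, independent of the webhdfs code under discussion:
{code}
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

/** Sketch: read a possibly-chunked HTTP response without relying on Content-Length. */
public class ChunkedReadSketch {
  public static long readFully(String url) throws IOException {
    HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
    // For a chunked response this returns -1, because the header is absent.
    long advertised = conn.getContentLengthLong();
    long actual = 0;
    try (InputStream in = conn.getInputStream()) {
      byte[] buf = new byte[8192];
      for (int n; (n = in.read(buf)) != -1; ) {
        actual += n;
      }
    }
    System.out.println("advertised=" + advertised + ", bytes read=" + actual);
    return actual;
  }
}
{code}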
[jira] [Created] (HDFS-3804) TestHftpFileSystem fails intermittently with JDK7
Trevor Robinson created HDFS-3804: - Summary: TestHftpFileSystem fails intermittently with JDK7 Key: HDFS-3804 URL: https://issues.apache.org/jira/browse/HDFS-3804 Project: Hadoop HDFS Issue Type: Bug Components: test Environment: Apache Maven 3.0.4 Maven home: /usr/share/maven Java version: 1.7.0_04, vendor: Oracle Corporation Java home: /usr/lib/jvm/jdk1.7.0_04/jre Default locale: en_US, platform encoding: ISO-8859-1 OS name: linux, version: 3.2.0-25-generic, arch: amd64, family: unix Reporter: Trevor Robinson For example: testFileNameEncoding(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem closed testDataNodeRedirect(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem closed This test case sets up a filesystem that is used by the first half of the test methods (in declaration order), but the second half of the tests start by calling {{FileSystem.closeAll}}. With JDK7, test methods are run in an arbitrary order, so if any first half methods run after any second half methods, they fail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3048) Small race in BlockManager#close
[ https://issues.apache.org/jira/browse/HDFS-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434636#comment-13434636 ] Hadoop QA commented on HDFS-3048: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12540834/hdfs-3787-2.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3006//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3006//console This message is automatically generated. Small race in BlockManager#close Key: HDFS-3048 URL: https://issues.apache.org/jira/browse/HDFS-3048 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 2.0.0-alpha Reporter: Eli Collins Assignee: Andy Isaacson Attachments: hdfs-3048.txt, hdfs-3787-2.txt There's a small race in BlockManager#close, we close the BlocksMap before the replication monitor, which means the replication monitor can NPE if it tries to access the blocks map. We need to swap the order (close the blocks map after shutting down the repl monitor). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
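For context on the fix being tested above, the race in the description is purely an ordering problem: the replication monitor thread has to be stopped and joined before the structure it reads is torn down. A schematic, self-contained sketch of that ordering; it is not the actual BlockManager code.
{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class OrderedCloseSketch {
  private volatile ConcurrentMap<Long, String> blocksMap = new ConcurrentHashMap<>();

  private final Thread replicationMonitor = new Thread(() -> {
    while (!Thread.currentThread().isInterrupted()) {
      ConcurrentMap<Long, String> map = blocksMap;
      if (map == null) {
        return; // defensive: never reached once close() uses the corrected order
      }
      // ... scan map for under-replicated blocks ...
      try {
        Thread.sleep(100);
      } catch (InterruptedException e) {
        return;
      }
    }
  }, "ReplicationMonitor");

  public OrderedCloseSketch() {
    replicationMonitor.start();
  }

  public void close() throws InterruptedException {
    // Stop and join the monitor first...
    replicationMonitor.interrupt();
    replicationMonitor.join();
    // ...and only then drop the structure it reads. Doing these two steps in the
    // opposite order is the small race described in the issue.
    blocksMap = null;
  }
}
{code}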
[jira] [Updated] (HDFS-3804) TestHftpFileSystem fails intermittently with JDK7
[ https://issues.apache.org/jira/browse/HDFS-3804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Trevor Robinson updated HDFS-3804: -- Attachment: HDFS-3804.patch The attached patch splits the test case in two: the tests requiring setup and teardown remain in TestHftpFileSystem.java and the tests that reset the filesystem cache are in TestHftpFileSystemReset.java. TestHftpFileSystem fails intermittently with JDK7 - Key: HDFS-3804 URL: https://issues.apache.org/jira/browse/HDFS-3804 Project: Hadoop HDFS Issue Type: Bug Components: test Environment: Apache Maven 3.0.4 Maven home: /usr/share/maven Java version: 1.7.0_04, vendor: Oracle Corporation Java home: /usr/lib/jvm/jdk1.7.0_04/jre Default locale: en_US, platform encoding: ISO-8859-1 OS name: linux, version: 3.2.0-25-generic, arch: amd64, family: unix Reporter: Trevor Robinson Attachments: HDFS-3804.patch For example: testFileNameEncoding(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem closed testDataNodeRedirect(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem closed This test case sets up a filesystem that is used by the first half of the test methods (in declaration order), but the second half of the tests start by calling {{FileSystem.closeAll}}. With JDK7, test methods are run in an arbitrary order, so if any first half methods run after any second half methods, they fail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
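A rough sketch of the shape of such a split, with class and method names paraphrased rather than copied from the attached patch: the tests that reset the FileSystem cache get their own class and re-create the filesystem per test, so JDK7's arbitrary method ordering can no longer let them invalidate a fixture shared with unrelated tests.
{code}
import static org.junit.Assert.assertNotNull;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

public class FileSystemResetSketchTest {
  private FileSystem fs;

  @Before
  public void setUp() throws Exception {
    // Each test starts from a clean cache and its own FileSystem instance.
    FileSystem.closeAll();
    fs = FileSystem.get(new Configuration());
  }

  @After
  public void tearDown() throws Exception {
    if (fs != null) {
      fs.close();
    }
  }

  @Test
  public void testUsableAfterCacheReset() throws Exception {
    assertNotNull(fs.getUri());
  }
}
{code}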
[jira] [Updated] (HDFS-3804) TestHftpFileSystem fails intermittently with JDK7
[ https://issues.apache.org/jira/browse/HDFS-3804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Trevor Robinson updated HDFS-3804: -- Assignee: Trevor Robinson Status: Patch Available (was: Open) TestHftpFileSystem fails intermittently with JDK7 - Key: HDFS-3804 URL: https://issues.apache.org/jira/browse/HDFS-3804 Project: Hadoop HDFS Issue Type: Bug Components: test Environment: Apache Maven 3.0.4 Maven home: /usr/share/maven Java version: 1.7.0_04, vendor: Oracle Corporation Java home: /usr/lib/jvm/jdk1.7.0_04/jre Default locale: en_US, platform encoding: ISO-8859-1 OS name: linux, version: 3.2.0-25-generic, arch: amd64, family: unix Reporter: Trevor Robinson Assignee: Trevor Robinson Attachments: HDFS-3804.patch For example: testFileNameEncoding(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem closed testDataNodeRedirect(org.apache.hadoop.hdfs.TestHftpFileSystem): Filesystem closed This test case sets up a filesystem that is used by the first half of the test methods (in declaration order), but the second half of the tests start by calling {{FileSystem.closeAll}}. With JDK7, test methods are run in an arbitrary order, so if any first half methods run after any second half methods, they fail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434656#comment-13434656 ] Suresh Srinivas commented on HDFS-3723: --- Test failures are unrelated to the patch. All commands should support meaningful --help - Key: HDFS-3723 URL: https://issues.apache.org/jira/browse/HDFS-3723 Project: Hadoop HDFS Issue Type: Improvement Components: scripts, tools Affects Versions: 2.0.0-alpha Reporter: E. Sammer Assignee: Jing Zhao Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.002.patch, HDFS-3723.003.patch, HDFS-3723.patch, HDFS-3723.patch Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking. {code} [esammer@hadoop-fed01 ~]# hdfs zkfc --help Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode. at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168) {code} This would go a long way toward better usability for ops staff. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2966) TestNameNodeMetrics tests can fail under load
[ https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434660#comment-13434660 ] Trevor Robinson commented on HDFS-2966: --- I hit this on my last 2 builds of trunk. I don't see an open issue on it, so should I create a new issue or reopen this one (or HDFS-540)? {noformat} testCorruptBlock(org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics) Time elapsed: 7.082 sec FAILURE! java.lang.AssertionError: Bad value for metric PendingReplicationBlocks expected:0 but was:1 at org.junit.Assert.fail(Assert.java:91) at org.junit.Assert.failNotEquals(Assert.java:645) at org.junit.Assert.assertEquals(Assert.java:126) at org.junit.Assert.assertEquals(Assert.java:470) at org.apache.hadoop.test.MetricsAsserts.assertGauge(MetricsAsserts.java:191) at org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics.testCorruptBlock(TestNameNodeMetrics.java:186) {noformat} TestNameNodeMetrics tests can fail under load - Key: HDFS-2966 URL: https://issues.apache.org/jira/browse/HDFS-2966 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.0.0-alpha Environment: OS/X running intellij IDEA, firefox, winxp in a virtualbox. Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Fix For: 2.2.0-alpha Attachments: HDFS-2966.patch, HDFS-2966.patch, HDFS-2966.patch, HDFS-2966.patch I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of running the HDFS tests on a desktop with out enough memory for all the programs trying to run. Things got swapped out and the tests failed as the DN heartbeats didn't come in on time. the tests both rely on {{waitForDeletion()}} to block the tests until the delete operation has completed, but all it does is sleep for the same number of seconds as there are datanodes. This is too brittle -it may work on a lightly-loaded system, but not on a system under heavy load where it is taking longer to replicate than expect. Immediate fix: double, triple, the sleep time? Better fix: have the thread block until all the DN heartbeats have finished. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
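The "better fix" suggested at the end of the description is the usual replace-a-fixed-sleep-with-a-poll pattern: wait on the condition itself (here, the pending-replication metric or the deletion completing) with a timeout, so a loaded machine simply takes more iterations rather than failing. A generic hedged sketch of such a helper; it is not Hadoop's own test utility class.
{code}
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

public class WaitForSketch {
  public static void waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs)
      throws InterruptedException, TimeoutException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (!condition.getAsBoolean()) {
      if (System.currentTimeMillis() > deadline) {
        throw new TimeoutException("Condition not met within " + timeoutMs + " ms");
      }
      // Poll rather than sleeping "one second per datanode": under heavy load the
      // loop just runs longer instead of giving up at an arbitrary point.
      Thread.sleep(intervalMs);
    }
  }
}
{code}
A test would then call something like waitFor(() -> getPendingReplicationBlocks() == 0, 100, 30000) in place of a fixed Thread.sleep, where getPendingReplicationBlocks() stands in for whatever metric read the test already performs.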
[jira] [Updated] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-3723: -- Resolution: Fixed Fix Version/s: 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I committed the patch. I had to merge some conflicts related to imports in NameNode.java. Will post the updated file. Thank you Jing for contributing this. All commands should support meaningful --help - Key: HDFS-3723 URL: https://issues.apache.org/jira/browse/HDFS-3723 Project: Hadoop HDFS Issue Type: Improvement Components: scripts, tools Affects Versions: 2.0.0-alpha Reporter: E. Sammer Assignee: Jing Zhao Fix For: 3.0.0 Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.002.patch, HDFS-3723.003.patch, HDFS-3723.patch, HDFS-3723.patch, HDFS-3723.patch Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking. {code} [esammer@hadoop-fed01 ~]# hdfs zkfc --help Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode. at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168) {code} This would go a long way toward better usability for ops staff. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
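The ordering asked for in the description (option checking before state / configuration checking) comes down to scanning the arguments for a help flag before any configuration is parsed or validated. A hedged sketch of that ordering in a generic tool entry point; it is not the committed DFSZKFailoverController change, and the usage string is a placeholder.
{code}
public class HelpFirstToolSketch {
  private static final String USAGE = "Usage: <command> [options]";

  public static void main(String[] args) {
    for (String arg : args) {
      if (arg.equals("-h") || arg.equals("-help") || arg.equals("--help")) {
        // Print usage and exit before touching configuration, so --help works
        // even when HA is not enabled or the configuration is otherwise invalid.
        System.out.println(USAGE);
        return;
      }
    }
    // Only now load and validate configuration, then run the real tool...
  }
}
{code}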
[jira] [Updated] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-3723: -- Attachment: HDFS-3723.patch Updated patch post merging with the trunk. All commands should support meaningful --help - Key: HDFS-3723 URL: https://issues.apache.org/jira/browse/HDFS-3723 Project: Hadoop HDFS Issue Type: Improvement Components: scripts, tools Affects Versions: 2.0.0-alpha Reporter: E. Sammer Assignee: Jing Zhao Fix For: 3.0.0 Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.002.patch, HDFS-3723.003.patch, HDFS-3723.patch, HDFS-3723.patch, HDFS-3723.patch Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking. {code} [esammer@hadoop-fed01 ~]# hdfs zkfc --help Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode. at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168) {code} This would go a long way toward better usability for ops staff. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname
[ https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434673#comment-13434673 ] Hudson commented on HDFS-3150: -- Integrated in Hadoop-Common-trunk-Commit #2577 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2577/]) HDFS-3150. Add option for clients to contact DNs via hostname. Contributed by Eli Collins (Revision 1373094) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1373094 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeID.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DNConf.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileChecksumServlets.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestMiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestShortCircuitLocalRead.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java Add option for clients to contact DNs via hostname -- Key: HDFS-3150 URL: https://issues.apache.org/jira/browse/HDFS-3150 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, hdfs client 
Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Fix For: 1.1.0, 2.2.0-alpha Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} is the wildcard) however per HADOOP-6867 only the source address (IP) of the registration is given to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup, this had the side effect of breaking DN multihoming (the client can not route the IP exposed by the NN if the DN registers with an interface that has a cluster-private IP). To fix this let's add back the option for Datanodes to be accessed by hostname. This can be done by: # Modifying the primary field of the Datanode descriptor to be the hostname, or # Modifying Client/Datanode - Datanode access use the hostname field instead of the IP Approach #2 does not require an incompatible client protocol change, and is much less invasive. It minimizes the scope of modification to just places where clients and Datanodes connect, vs changing all uses of
[jira] [Commented] (HDFS-3765) Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages
[ https://issues.apache.org/jira/browse/HDFS-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434674#comment-13434674 ] Hudson commented on HDFS-3765: -- Integrated in Hadoop-Common-trunk-Commit #2577 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2577/]) HDFS-3765. namenode -initializeSharedEdits should be able to initialize all shared storages. Contributed by Vinay and Todd Lipcon. (Revision 1373061) Result = SUCCESS todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1373061 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ExitUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperAsHASharedDir.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/HATestUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestInitializeSharedEdits.java Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages --- Key: HDFS-3765 URL: https://issues.apache.org/jira/browse/HDFS-3765 Project: Hadoop HDFS Issue Type: Improvement Components: ha Affects Versions: 2.1.0-alpha, 3.0.0 Reporter: Vinay Assignee: Vinay Fix For: 3.0.0, 2.2.0-alpha Attachments: hdfs-3765-branch-2.txt, HDFS-3765.patch, HDFS-3765.patch, HDFS-3765.patch, hdfs-3765.txt, hdfs-3765.txt, hdfs-3765.txt Currently, NameNode INITIALIZESHAREDEDITS provides ability to copy the edits files to file schema based shared storages when moving cluster from Non-HA environment to HA enabled environment. This Jira focuses on the following * Generalizing the logic of copying the edits to new shared storage so that any schema based shared storage can initialized for HA cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3718) Datanode won't shutdown because of runaway DataBlockScanner thread
[ https://issues.apache.org/jira/browse/HDFS-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434675#comment-13434675 ] Hudson commented on HDFS-3718: -- Integrated in Hadoop-Common-trunk-Commit #2577 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2577/]) HDFS-3718. Datanode won't shutdown because of runaway DataBlockScanner thread (Kihwal Lee via daryn) (Revision 1373090) Result = SUCCESS daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1373090 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java Datanode won't shutdown because of runaway DataBlockScanner thread -- Key: HDFS-3718 URL: https://issues.apache.org/jira/browse/HDFS-3718 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.1-alpha Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha Attachments: hdfs-3718.patch.txt Datanode sometimes does not shutdown because the block pool scanner thread keeps running. It prints out Starting a new period every five seconds, even after {{shutdown()}} is called. Somehow the interrupt is missed. {{DataBlockScanner}} will also terminate if {{datanode.shouldRun}} is false, but in {{DataNode#shutdown}}, {{DataBlockScanner#shutdown()}} is invoked before it is being set to false. Is there any reason why {{datanode.shouldRun}} is set to false later? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
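On the question at the end of the description: the conventional ordering is to clear the run flag before interrupting and joining the worker, so that even a swallowed interrupt still lets the loop exit on its next check of the flag. A schematic sketch of that pattern; it is not the committed DataNode change.
{code}
public class ScannerShutdownSketch {
  private volatile boolean shouldRun = true;

  private final Thread scanner = new Thread(() -> {
    while (shouldRun && !Thread.currentThread().isInterrupted()) {
      // ... scan one block pool slice ...
      try {
        Thread.sleep(5000);
      } catch (InterruptedException e) {
        return;
      }
    }
  }, "BlockScanner");

  public void start() {
    scanner.start();
  }

  public void shutdown() throws InterruptedException {
    // Clear the flag *before* interrupting: if the interrupt is missed, the loop
    // condition still stops the thread on its next pass instead of running away.
    shouldRun = false;
    scanner.interrupt();
    scanner.join();
  }
}
{code}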
[jira] [Commented] (HDFS-3150) Add option for clients to contact DNs via hostname
[ https://issues.apache.org/jira/browse/HDFS-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434677#comment-13434677 ] Hudson commented on HDFS-3150: -- Integrated in Hadoop-Hdfs-trunk-Commit #2642 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2642/]) HDFS-3150. Add option for clients to contact DNs via hostname. Contributed by Eli Collins (Revision 1373094) Result = FAILURE eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1373094 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeID.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientDatanodeProtocolTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DNConf.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileChecksumServlets.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileCreation.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestMiniDFSCluster.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestShortCircuitLocalRead.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java Add option for clients to contact DNs via hostname -- Key: HDFS-3150 URL: https://issues.apache.org/jira/browse/HDFS-3150 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, hdfs client Affects 
Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Fix For: 1.1.0, 2.2.0-alpha Attachments: hdfs-3150-b1.txt, hdfs-3150-b1.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt, hdfs-3150.txt The DN listens on multiple IP addresses (the default {{dfs.datanode.address}} is the wildcard) however per HADOOP-6867 only the source address (IP) of the registration is given to clients. HADOOP-985 made clients access datanodes by IP primarily to avoid the latency of a DNS lookup, this had the side effect of breaking DN multihoming (the client can not route the IP exposed by the NN if the DN registers with an interface that has a cluster-private IP). To fix this let's add back the option for Datanodes to be accessed by hostname. This can be done by: # Modifying the primary field of the Datanode descriptor to be the hostname, or # Modifying Client/Datanode - Datanode access use the hostname field instead of the IP Approach #2 does not require an incompatible client protocol change, and is much less invasive. It minimizes the scope of modification to just places where clients and Datanodes connect, vs changing all uses of Datanode
[jira] [Commented] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434678#comment-13434678 ] Hudson commented on HDFS-3723: -- Integrated in Hadoop-Hdfs-trunk-Commit #2642 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2642/]) HDFS-3723. Add support -h, -help to all the commands. Contributed by Jing Zhao. (Revision 1373170) Result = FAILURE suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1373170 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSHAAdmin.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSZKFailoverController.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSck.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DelegationTokenFetcher.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/GetConf.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/GetGroups.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestDFSHAAdmin.java All commands should support meaningful --help - Key: HDFS-3723 URL: https://issues.apache.org/jira/browse/HDFS-3723 Project: Hadoop HDFS Issue Type: Improvement Components: scripts, tools Affects Versions: 2.0.0-alpha Reporter: E. Sammer Assignee: Jing Zhao Fix For: 3.0.0 Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.002.patch, HDFS-3723.003.patch, HDFS-3723.patch, HDFS-3723.patch, HDFS-3723.patch Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking. {code} [esammer@hadoop-fed01 ~]# hdfs zkfc --help Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode. at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168) {code} This would go a long way toward better usability for ops staff. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3765) Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages
[ https://issues.apache.org/jira/browse/HDFS-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434679#comment-13434679 ] Hudson commented on HDFS-3765: -- Integrated in Hadoop-Hdfs-trunk-Commit #2642 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2642/]) HDFS-3765. namenode -initializeSharedEdits should be able to initialize all shared storages. Contributed by Vinay and Todd Lipcon. (Revision 1373061) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1373061 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ExitUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperAsHASharedDir.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/HATestUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestInitializeSharedEdits.java Namenode INITIALIZESHAREDEDITS should be able to initialize all shared storages --- Key: HDFS-3765 URL: https://issues.apache.org/jira/browse/HDFS-3765 Project: Hadoop HDFS Issue Type: Improvement Components: ha Affects Versions: 2.1.0-alpha, 3.0.0 Reporter: Vinay Assignee: Vinay Fix For: 3.0.0, 2.2.0-alpha Attachments: hdfs-3765-branch-2.txt, HDFS-3765.patch, HDFS-3765.patch, HDFS-3765.patch, hdfs-3765.txt, hdfs-3765.txt, hdfs-3765.txt Currently, NameNode INITIALIZESHAREDEDITS provides ability to copy the edits files to file schema based shared storages when moving cluster from Non-HA environment to HA enabled environment. This Jira focuses on the following * Generalizing the logic of copying the edits to new shared storage so that any schema based shared storage can initialized for HA cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3718) Datanode won't shutdown because of runaway DataBlockScanner thread
[ https://issues.apache.org/jira/browse/HDFS-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434680#comment-13434680 ] Hudson commented on HDFS-3718: -- Integrated in Hadoop-Hdfs-trunk-Commit #2642 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2642/]) HDFS-3718. Datanode won't shutdown because of runaway DataBlockScanner thread (Kihwal Lee via daryn) (Revision 1373090) Result = FAILURE daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1373090 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java Datanode won't shutdown because of runaway DataBlockScanner thread -- Key: HDFS-3718 URL: https://issues.apache.org/jira/browse/HDFS-3718 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.0.1-alpha Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 0.23.3, 2.1.0-alpha, 3.0.0, 2.2.0-alpha Attachments: hdfs-3718.patch.txt Datanode sometimes does not shutdown because the block pool scanner thread keeps running. It prints out Starting a new period every five seconds, even after {{shutdown()}} is called. Somehow the interrupt is missed. {{DataBlockScanner}} will also terminate if {{datanode.shouldRun}} is false, but in {{DataNode#shutdown}}, {{DataBlockScanner#shutdown()}} is invoked before it is being set to false. Is there any reason why {{datanode.shouldRun}} is set to false later? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3803) BlockPoolSliceScanner new work period notice is very chatty at INFO level
[ https://issues.apache.org/jira/browse/HDFS-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434690#comment-13434690 ] Suresh Srinivas commented on HDFS-3803: --- +1 for the patch. I will commit it soon. BlockPoolSliceScanner new work period notice is very chatty at INFO level - Key: HDFS-3803 URL: https://issues.apache.org/jira/browse/HDFS-3803 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.1.0-alpha, 2.0.1-alpha Environment: Hadoop 2.0.1-alpha-SNAPSHOT Reporter: Andrew Purtell Priority: Minor Attachments: HDFS-3803.patch One line of ~140 chars logged every 5 seconds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3802) StartupOption.name in HdfsServerConstants should be final
[ https://issues.apache.org/jira/browse/HDFS-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-3802: Assignee: Jing Zhao Status: Patch Available (was: Open) StartupOption.name in HdfsServerConstants should be final - Key: HDFS-3802 URL: https://issues.apache.org/jira/browse/HDFS-3802 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Trivial Fix For: 3.0.0 Attachments: HDFS-3802.patch In HdfsServerConstants, it may be better to define StartupOption.name as final since it will not and should not be modified after initialization. For example, in NameNode.java, the printUsage function prints out multiple startup options' names; a modification of StartupOption.name could therefore produce an invalid usage message. Although there are currently no methods that change or set the value of StartupOption.name, it is better to add the final keyword to make sure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
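For reference, making an enum's constructor-initialized field immutable is a one-keyword change; a generic sketch of the pattern (constant and field names here are illustrative, not the HdfsServerConstants source):
{code}
public enum StartupOptionSketch {
  FORMAT("-format"),
  REGULAR("-regular"),
  UPGRADE("-upgrade");

  // 'final' guarantees the flag string printed in usage messages can never be
  // reassigned after the constant is constructed.
  private final String name;

  StartupOptionSketch(String name) {
    this.name = name;
  }

  public String getName() {
    return name;
  }
}
{code}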
[jira] [Updated] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-3723: -- Attachment: HDFS-3723.patch Attaching the complete patch. All commands should support meaningful --help - Key: HDFS-3723 URL: https://issues.apache.org/jira/browse/HDFS-3723 Project: Hadoop HDFS Issue Type: Improvement Components: scripts, tools Affects Versions: 2.0.0-alpha Reporter: E. Sammer Assignee: Jing Zhao Fix For: 3.0.0 Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.002.patch, HDFS-3723.003.patch, HDFS-3723.patch, HDFS-3723.patch, HDFS-3723.patch Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking. {code} [esammer@hadoop-fed01 ~]# hdfs zkfc --help Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode. at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168) {code} This would go a long way toward better usability for ops staff. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
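The fix being asked for is essentially an ordering change: recognize a help flag before any configuration or HA-state checks run. A hedged sketch of that ordering follows; the class name, usage string, and flag spellings are placeholders, not the contents of the attached patch.

{code}
// Illustrative only -- not the actual HDFS-3723 patch.
public class ZkfcToolSketch {
  private static final String USAGE =
      "Usage: hdfs zkfc [-formatZK [-force | -nonInteractive]] [-h|--help]";

  public static void main(String[] args) {
    // Handle help requests *before* loading configuration or verifying HA
    // state, so --help works regardless of how the node is configured.
    for (String arg : args) {
      if ("-h".equals(arg) || "-help".equals(arg) || "--help".equals(arg)) {
        System.out.println(USAGE);
        return;
      }
    }
    // ... only now parse the configuration, check that HA is enabled,
    // and hand off to the failover controller.
  }
}
{code}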
[jira] [Updated] (HDFS-3723) All commands should support meaningful --help
[ https://issues.apache.org/jira/browse/HDFS-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-3723: -- Attachment: (was: HDFS-3723.patch) All commands should support meaningful --help - Key: HDFS-3723 URL: https://issues.apache.org/jira/browse/HDFS-3723 Project: Hadoop HDFS Issue Type: Improvement Components: scripts, tools Affects Versions: 2.0.0-alpha Reporter: E. Sammer Assignee: Jing Zhao Fix For: 3.0.0 Attachments: HDFS-3723.001.patch, HDFS-3723.001.patch, HDFS-3723.002.patch, HDFS-3723.003.patch, HDFS-3723.patch, HDFS-3723.patch, HDFS-3723.patch Some (sub)commands support -help or -h options for detailed help while others do not. Ideally, all commands should support meaningful help that works regardless of current state or configuration. For example, hdfs zkfc --help (or -h or -help) is not very useful. Option checking should occur before state / configuration checking. {code} [esammer@hadoop-fed01 ~]# hdfs zkfc --help Exception in thread main org.apache.hadoop.HadoopIllegalArgumentException: HA is not enabled for this namenode. at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.setConf(DFSZKFailoverController.java:122) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:66) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:168) {code} This would go a long way toward better usability for ops staff. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-3803) BlockPoolSliceScanner new work period notice is very chatty at INFO level
[ https://issues.apache.org/jira/browse/HDFS-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas resolved HDFS-3803. --- Resolution: Fixed Fix Version/s: 3.0.0 Hadoop Flags: Reviewed I committed the patch. Thank you Andrew. BlockPoolSliceScanner new work period notice is very chatty at INFO level - Key: HDFS-3803 URL: https://issues.apache.org/jira/browse/HDFS-3803 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.1.0-alpha, 2.0.1-alpha Environment: Hadoop 2.0.1-alpha-SNAPSHOT Reporter: Andrew Purtell Priority: Minor Fix For: 3.0.0 Attachments: HDFS-3803.patch One line of ~140 chars logged every 5 seconds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3672) Expose disk-location information for blocks to enable better scheduling
[ https://issues.apache.org/jira/browse/HDFS-3672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434699#comment-13434699 ] Andrew Wang commented on HDFS-3672: --- Ran TestFsck locally and it passed; I think the test failures are unrelated. Expose disk-location information for blocks to enable better scheduling --- Key: HDFS-3672 URL: https://issues.apache.org/jira/browse/HDFS-3672 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Andrew Wang Assignee: Andrew Wang Attachments: design-doc-v1.pdf, design-doc-v2.pdf, hdfs-3672-1.patch, hdfs-3672-2.patch, hdfs-3672-3.patch, hdfs-3672-4.patch, hdfs-3672-5.patch, hdfs-3672-6.patch, hdfs-3672-7.patch, hdfs-3672-8.patch, hdfs-3672-9.patch Currently, HDFS exposes on which datanodes a block resides, which allows clients to make scheduling decisions for locality and load balancing. Extending this to also expose on which disk on a datanode a block resides would enable even better scheduling, on a per-disk rather than coarse per-datanode basis. This API would likely look similar to FileSystem#getFileBlockLocations, but also involve a series of RPCs to the responsible datanodes to determine disk ids. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
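The description sketches an API shaped like FileSystem#getFileBlockLocations, augmented with an opaque per-disk identifier that has to be resolved via extra RPCs to the datanodes. A hypothetical shape for such an interface is below; the type and method names are illustrative and are not claimed to match the attached patches.

{code}
// Hypothetical sketch of a disk-aware block location API; names are
// illustrative and may not match the committed patch.
public interface DiskAwareBlockLocations {
  /**
   * Like FileSystem#getFileBlockLocations, but each returned location also
   * carries an opaque identifier for the volume (disk) holding the replica
   * on each datanode. Filling in the identifiers requires additional RPCs
   * to the datanodes hosting the replicas.
   */
  BlockDiskLocation[] getFileBlockDiskLocations(
      org.apache.hadoop.fs.Path path, long start, long len)
      throws java.io.IOException;

  /** A block location plus one opaque volume id per replica host. */
  interface BlockDiskLocation {
    String[] getHosts() throws java.io.IOException;
    byte[][] getVolumeIds();
  }
}
{code}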
[jira] [Commented] (HDFS-3793) Implement genericized format() in QJM
[ https://issues.apache.org/jira/browse/HDFS-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434702#comment-13434702 ] Todd Lipcon commented on HDFS-3793: --- Fixed the two nits and committed to branch. Thanks. bq. Seems like you should make the QuorumJournalManager#format and QuorumJournalManager#hasSomeData timeouts configurable, or at least use constants and add a comment or two justifying how you chose those values. I added constants and set them both to 60 seconds. Also added a comment explaining that, since they are only used in format and not normal operation, we can use a fairly long timeout and don't really need to configure them (if a user sees a timeout they can manually investigate why it's taking 60+ seconds and do something about it). bq. I think I see the reasoning behind the need for the call to unlockAll in JNStorage#format, but you might want to add a comment explaining why it's necessary. Also, if this happens, when will the storage be locked again? Might want to add a comment explaining that as well. Added a comment: {code} // Unlock the directory before formatting, because we will // re-analyze it after format(). The analyzeStorage() call // below is responsible for re-locking it. This is a no-op // if the storage is not currently locked. unlockAll(); {code} Implement genericized format() in QJM - Key: HDFS-3793 URL: https://issues.apache.org/jira/browse/HDFS-3793 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-3793.txt HDFS-3695 added the ability for non-File journal managers to tie into calls like NameNode -format. This JIRA is to implement format() for QuorumJournalManager. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
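For reference, a minimal sketch of the first point above -- fixed 60-second constants for the two format-path calls rather than new configuration keys; the constant and class names here are assumptions, not necessarily the identifiers used in the branch.

{code}
// Illustrative sketch; names are assumptions, not the branch's identifiers.
public class QuorumJournalManagerTimeoutsSketch {
  // Used only by format() and hasSomeData(), never during normal operation,
  // so a fixed, fairly long timeout is acceptable in place of a config key.
  public static final int FORMAT_TIMEOUT_MS = 60000;
  public static final int HAS_SOME_DATA_TIMEOUT_MS = 60000;
}
{code}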
[jira] [Commented] (HDFS-3803) BlockPoolSliceScanner new work period notice is very chatty at INFO level
[ https://issues.apache.org/jira/browse/HDFS-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434703#comment-13434703 ] Andy Isaacson commented on HDFS-3803: - The BlockPoolSliceScanner is supposed to start a new period every three weeks, not every 5 seconds. See HDFS-3194. I think this -LOG.info +LOG.debug change should be reverted and https://issues.apache.org/jira/browse/HDFS-3194?focusedCommentId=13399085page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13399085 should be merged instead. BlockPoolSliceScanner new work period notice is very chatty at INFO level - Key: HDFS-3803 URL: https://issues.apache.org/jira/browse/HDFS-3803 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.1.0-alpha, 2.0.1-alpha Environment: Hadoop 2.0.1-alpha-SNAPSHOT Reporter: Andrew Purtell Priority: Minor Fix For: 3.0.0 Attachments: HDFS-3803.patch One line of ~140 chars logged every 5 seconds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
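For context, the attached patch amounts to demoting the per-period notice from INFO to DEBUG; a minimal sketch of that kind of change follows (the class, logger, and message text are illustrative, not the literal diff), while the comment above argues the real fix is the period length addressed in HDFS-3194.

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

// Illustrative sketch of an INFO -> DEBUG demotion for a chatty message;
// not the literal HDFS-3803 diff.
public class ScannerLoggingSketch {
  private static final Log LOG = LogFactory.getLog(ScannerLoggingSketch.class);

  void logNewPeriod(double workLeftPct) {
    // Previously logged at INFO every five seconds -- very chatty.
    if (LOG.isDebugEnabled()) {
      LOG.debug(String.format(
          "Starting a new period : work left in prev period : %.2f%%", workLeftPct));
    }
  }
}
{code}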
[jira] [Resolved] (HDFS-3793) Implement genericized format() in QJM
[ https://issues.apache.org/jira/browse/HDFS-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved HDFS-3793. --- Resolution: Fixed Fix Version/s: QuorumJournalManager (HDFS-3077) Hadoop Flags: Reviewed Implement genericized format() in QJM - Key: HDFS-3793 URL: https://issues.apache.org/jira/browse/HDFS-3793 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: QuorumJournalManager (HDFS-3077) Attachments: hdfs-3793.txt HDFS-3695 added the ability for non-File journal managers to tie into calls like NameNode -format. This JIRA is to implement format() for QuorumJournalManager. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-3798) Avoid throwing NPE when finalizeSegment() is called on invalid segment
[ https://issues.apache.org/jira/browse/HDFS-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved HDFS-3798. --- Resolution: Fixed Fix Version/s: QuorumJournalManager (HDFS-3077) Hadoop Flags: Reviewed Avoid throwing NPE when finalizeSegment() is called on invalid segment -- Key: HDFS-3798 URL: https://issues.apache.org/jira/browse/HDFS-3798 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Trivial Fix For: QuorumJournalManager (HDFS-3077) Attachments: hdfs-3798.txt, hdfs-3798.txt Currently, if the client calls finalizeLogSegment() on a segment which doesn't exist on the JournalNode side, it throws an NPE. Instead it should throw a more intelligible exception. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
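The guard the issue calls for is straightforward: validate that the requested segment exists and throw a descriptive exception rather than letting a null dereference surface as an NPE. The sketch below uses a stand-in segment map; it is not the committed JournalNode change.

{code}
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Illustrative guard only -- not the actual JournalNode change.
public class FinalizeGuardSketch {
  /** Stand-in for the journal's view of an in-progress log segment. */
  static class Segment { long startTxId; long endTxId; }

  private final Map<Long, Segment> segments = new HashMap<Long, Segment>();

  public void finalizeLogSegment(long startTxId, long endTxId) throws IOException {
    Segment seg = segments.get(startTxId);
    if (seg == null) {
      // Fail with a descriptive message instead of dereferencing null.
      throw new IOException("No log segment starting at txid " + startTxId
          + " to finalize (requested end txid " + endTxId + ")");
    }
    seg.endTxId = endTxId;
    // ... mark the segment finalized on disk ...
  }
}
{code}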
[jira] [Commented] (HDFS-3799) QJM: handle empty log segments during recovery
[ https://issues.apache.org/jira/browse/HDFS-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434709#comment-13434709 ] Todd Lipcon commented on HDFS-3799: --- Fixed the spelling typo. Going to punt on the other thing - the different loop iterations fail separately enough that it's easier to diagnose them as separate test cases. Will commit momentarily with the nit addressed. QJM: handle empty log segments during recovery -- Key: HDFS-3799 URL: https://issues.apache.org/jira/browse/HDFS-3799 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: QuorumJournalManager (HDFS-3077) Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: QuorumJournalManager (HDFS-3077) Attachments: hdfs-3799.txt One of the cases not yet handled in the QJM branch is the one where either the writer or the journal node crashes after startLogSegment() but before it has written its first transaction to the log. We currently have TODO assertions in the code which fire in these cases. This JIRA is to deal with these cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3803) BlockPoolSliceScanner new work period notice is very chatty at INFO level
[ https://issues.apache.org/jira/browse/HDFS-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-3803: -- Fix Version/s: 2.1.0-alpha Committed to 2.1 BlockPoolSliceScanner new work period notice is very chatty at INFO level - Key: HDFS-3803 URL: https://issues.apache.org/jira/browse/HDFS-3803 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 2.1.0-alpha, 2.0.1-alpha Environment: Hadoop 2.0.1-alpha-SNAPSHOT Reporter: Andrew Purtell Priority: Minor Fix For: 2.1.0-alpha, 3.0.0 Attachments: HDFS-3803.patch One line of ~140 chars logged every 5 seconds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira