[jira] [Commented] (HDFS-7683) Combine usages and percent stats in NameNode UI
[ https://issues.apache.org/jira/browse/HDFS-7683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14293480#comment-14293480 ] Hadoop QA commented on HDFS-7683: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694757/HDFS-7683-001.patch against trunk revision 6f9fe76. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9344//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9344//console This message is automatically generated. Combine usages and percent stats in NameNode UI --- Key: HDFS-7683 URL: https://issues.apache.org/jira/browse/HDFS-7683 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Minor Attachments: 7683-snapshot.jpg, HDFS-7683-001.patch, HDFS-7683-001.patch In the NameNode UI, there are two separate rows displaying cluster usage, one in bytes and another in percentage.
We can combine these two rows and just display the percent usage in brackets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
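The change described above amounts to a small display tweak. A minimal sketch of the combined cell, with hypothetical names (`UsageCell` and its helpers are illustrative; they are not code from HDFS-7683-001.patch, which only touches dfshealth.html):

```java
import java.util.Locale;

// Illustrative only: render "used (percent)" in one cell instead of two rows.
public class UsageCell {
    static String format(long usedBytes, long capacityBytes) {
        double pct = capacityBytes == 0 ? 0.0 : 100.0 * usedBytes / capacityBytes;
        return String.format(Locale.ROOT, "%s (%.2f%%)", toHumanReadable(usedBytes), pct);
    }

    // Convert a byte count to a short human-readable string.
    static String toHumanReadable(long bytes) {
        if (bytes < 1024) return bytes + " B";
        String[] units = {"KB", "MB", "GB", "TB", "PB"};
        double v = bytes;
        int i = -1;
        while (v >= 1024 && i < units.length - 1) {
            v /= 1024;
            i++;
        }
        return String.format(Locale.ROOT, "%.1f %s", v, units[i]);
    }

    public static void main(String[] args) {
        // e.g. 512 MB used of 10 GB capacity -> "512.0 MB (5.00%)"
        System.out.println(format(512L * 1024 * 1024, 10L * 1024 * 1024 * 1024));
    }
}
```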
[jira] [Commented] (HDFS-7630) TestConnCache hardcode block size without considering native OS
[ https://issues.apache.org/jira/browse/HDFS-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14293539#comment-14293539 ] sam liu commented on HDFS-7630: --- Hi Arpit, According to your comments, I combined all fixes into HDFS-7630.002.patch. All the unit tests passed except TestBlockScanner, and that failure is not caused by the current patch. Could you please take a look? Thanks! TestConnCache hardcode block size without considering native OS --- Key: HDFS-7630 URL: https://issues.apache.org/jira/browse/HDFS-7630 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: sam liu Assignee: sam liu Attachments: HDFS-7630.001.patch, HDFS-7630.002.patch TestConnCache hardcodes the block size with 'BLOCK_SIZE = 4096', which is incorrect on some platforms. For example, on the POWER platform, the correct value is 65536. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
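The shape of a platform-safe fix can be sketched as follows (a hedged illustration of the idea only; `alignToPageSize` is a hypothetical helper, and the actual patch may obtain the native page size differently): instead of hardcoding BLOCK_SIZE = 4096, round the requested size up to a multiple of the page size.

```java
// Sketch: derive a test block size that is valid on platforms with larger pages
// (e.g. 64 KB on POWER) by rounding up to a multiple of the native page size.
public class BlockSizeUtil {
    // Round 'requested' up to the nearest multiple of 'pageSize'.
    static long alignToPageSize(long requested, long pageSize) {
        return ((requested + pageSize - 1) / pageSize) * pageSize;
    }

    public static void main(String[] args) {
        System.out.println(alignToPageSize(4096, 4096));   // x86: stays 4096
        System.out.println(alignToPageSize(4096, 65536));  // POWER: becomes 65536
    }
}
```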
[jira] [Commented] (HDFS-7423) various typos and message formatting fixes in nfs daemon and doc
[ https://issues.apache.org/jira/browse/HDFS-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293543#comment-14293543 ] Charles Lamb commented on HDFS-7423: Thank you for the review [~ste...@apache.org]. various typos and message formatting fixes in nfs daemon and doc Key: HDFS-7423 URL: https://issues.apache.org/jira/browse/HDFS-7423 Project: Hadoop HDFS Issue Type: Bug Components: nfs Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Priority: Trivial Attachments: HDFS-7423.001.patch, HDFS-7423.002.patch, HDFS-7423.003.patch These are accumulated fixes for log messages, formatting, typos, etc. in the nfs3 daemon that I came across while working on a customer issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7611) deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294095#comment-14294095 ] Konstantin Shvachko commented on HDFS-7611: --- The patch looks good. And it fixes Byron's test case. One thing that I worry about is that your previous patch did not fix {{TestOpenFilesWithSnapshot}}, which still timed out. When I was looking at {{TestFileTruncate}} its failure looked similar to the failure of {{TestOpenFilesWithSnapshot}}. So either we missed some other case or {{TestOpenFilesWithSnapshot}} has a different problem. deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. --- Key: HDFS-7611 URL: https://issues.apache.org/jira/browse/HDFS-7611 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Assignee: Byron Wong Priority: Critical Attachments: HDFS-7611.000.patch, HDFS-7611.001.patch, HDFS-7611.002.patch, blocksNotDeletedTest.patch, testTruncateEditLogLoad.log If quotas are enabled a combination of operations *deleteSnapshot* and *delete* of a file can leave orphaned blocks in the blocksMap on NameNode restart. They are counted as missing on the NameNode, and can prevent NameNode from coming out of safeMode and could cause memory leak during startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7683) Combine usages and percent stats in NameNode UI
[ https://issues.apache.org/jira/browse/HDFS-7683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-7683: - Resolution: Fixed Fix Version/s: 2.7.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed the patch to trunk and branch-2. Thanks [~vinayrpet] for the contribution. Combine usages and percent stats in NameNode UI --- Key: HDFS-7683 URL: https://issues.apache.org/jira/browse/HDFS-7683 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Minor Fix For: 2.7.0 Attachments: 7683-snapshot.jpg, HDFS-7683-001.patch, HDFS-7683-001.patch In NameNode UI, there are separate rows to display cluster usage, one is in bytes, another one is in percentage. We can combine these two rows to just display percent usage in brackets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7675) Unused member DFSClient.spanReceiverHost
[ https://issues.apache.org/jira/browse/HDFS-7675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294005#comment-14294005 ] Colin Patrick McCabe commented on HDFS-7675: Currently, we don't reference count our {{SpanReceivers}}. So if a single {{DFSClient}} exited and called {{closeReceivers}}, it could close things that other {{DFSClients}} were using. I would be fine with a patch adding reference counts to each {{SpanReceiver}} here. I don't think it will be that useful to many HDFS clients (maybe no production ones?), but it's more satisfying to have us close things when the {{DFSClient}} is closed. Unused member DFSClient.spanReceiverHost Key: HDFS-7675 URL: https://issues.apache.org/jira/browse/HDFS-7675 Project: Hadoop HDFS Issue Type: Bug Components: dfsclient Affects Versions: 3.0.0 Reporter: Konstantin Shvachko DFSClient.spanReceiverHost is initialised but never used. Could be redundant. This was introduced by HDFS-7055. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
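The reference-counting idea discussed above can be sketched like this (hedged illustration: `RefCountedReceiverHost` is a made-up stand-in, not the actual HTrace/HDFS `SpanReceiverHost` API): the shared resource is torn down only when the last client releases it.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative ref-counted wrapper: many clients share one receiver host;
// close() only tears it down once every retain() has been balanced by a close().
class RefCountedReceiverHost implements AutoCloseable {
    private final AtomicInteger refs = new AtomicInteger(0);
    private volatile boolean closed = false;

    RefCountedReceiverHost retain() {
        refs.incrementAndGet(); // a DFSClient-like user takes a reference
        return this;
    }

    @Override
    public void close() {
        if (refs.decrementAndGet() == 0) {
            closed = true; // this is where closeReceivers() would actually run
        }
    }

    boolean isClosed() {
        return closed;
    }
}
```

With this shape, one client closing no longer pulls the receivers out from under the others.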
[jira] [Commented] (HDFS-5796) The file system browser in the namenode UI requires SPNEGO.
[ https://issues.apache.org/jira/browse/HDFS-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14293957#comment-14293957 ] Haohui Mai commented on HDFS-5796: -- bq. Would that be OK to place back as a feature (turned off by default if needed), as the new file browser has regressed? Is giving every file the world-readable permission a possible workaround? It looks like the workaround is fully equivalent to using dr.who here? The file system browser in the namenode UI requires SPNEGO. --- Key: HDFS-5796 URL: https://issues.apache.org/jira/browse/HDFS-5796 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Arun Suresh Attachments: HDFS-5796.1.patch, HDFS-5796.1.patch, HDFS-5796.2.patch, HDFS-5796.3.patch, HDFS-5796.3.patch After HDFS-5382, the browser makes webhdfs REST calls directly, requiring SPNEGO to work between the user's browser and the namenode. This won't work if the cluster's security infrastructure is isolated from the regular network. Moreover, SPNEGO is not supposed to be required for user-facing web pages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3689) Add support for variable length block
[ https://issues.apache.org/jira/browse/HDFS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-3689: -- Hadoop Flags: Reviewed +1 patch looks good. Add support for variable length block - Key: HDFS-3689 URL: https://issues.apache.org/jira/browse/HDFS-3689 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, hdfs-client, namenode Affects Versions: 3.0.0 Reporter: Suresh Srinivas Assignee: Jing Zhao Attachments: HDFS-3689.000.patch, HDFS-3689.001.patch, HDFS-3689.002.patch, HDFS-3689.003.patch, HDFS-3689.003.patch, HDFS-3689.004.patch, HDFS-3689.005.patch, HDFS-3689.006.patch, HDFS-3689.007.patch, HDFS-3689.008.patch, HDFS-3689.008.patch, HDFS-3689.009.patch, HDFS-3689.009.patch, HDFS-3689.010.patch, editsStored Currently HDFS supports fixed length blocks. Supporting variable length block will allow new use cases and features to be built on top of HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7683) Combine usages and percent stats in NameNode UI
[ https://issues.apache.org/jira/browse/HDFS-7683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294001#comment-14294001 ] Hudson commented on HDFS-7683: -- FAILURE: Integrated in Hadoop-trunk-Commit #6939 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6939/]) HDFS-7683. Combine usages and percent stats in NameNode UI. Contributed by Vinayakumar B. (wheat9: rev 1e2d98a394d98f9f1b6791cbe9cef474c19b8ceb) * hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Combine usages and percent stats in NameNode UI --- Key: HDFS-7683 URL: https://issues.apache.org/jira/browse/HDFS-7683 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Minor Fix For: 2.7.0 Attachments: 7683-snapshot.jpg, HDFS-7683-001.patch, HDFS-7683-001.patch In NameNode UI, there are separate rows to display cluster usage, one is in bytes, another one is in percentage. We can combine these two rows to just display percent usage in brackets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7677) DistributedFileSystem#truncate should resolve symlinks
[ https://issues.apache.org/jira/browse/HDFS-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14293996#comment-14293996 ] Konstantin Shvachko commented on HDFS-7677: --- +1 from me too. I don't think this could do anything to eclipse:eclipse. DistributedFileSystem#truncate should resolve symlinks -- Key: HDFS-7677 URL: https://issues.apache.org/jira/browse/HDFS-7677 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-7677.001.patch, HDFS-7677.002.patch We should resolve the symlinks in DistributedFileSystem#truncate as we do for {{create}}, {{open}}, {{append}} and so on; I don't see any reason not to support it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
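As a generic illustration of the resolve-before-operate pattern the patch adds (plain java.nio here, not the DFSClient code path the patch actually modifies):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Resolve a path's symlinks before acting on it, mirroring what the patch
// makes truncate do (create/open/append already behave this way).
public class ResolveBeforeTruncate {
    static Path resolve(Path p) throws IOException {
        return p.toRealPath(); // follows symlinks; fails if the target is missing
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("demo");
        Path target = Files.writeString(dir.resolve("data.txt"), "hello");
        Path link = Files.createSymbolicLink(dir.resolve("link.txt"), target);
        // the operation should land on the resolved target, not the link itself
        System.out.println(resolve(link).equals(target.toRealPath()));
    }
}
```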
[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294037#comment-14294037 ] Andrew Wang commented on HDFS-7411: --- Looks like a flake in the balancer test. Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch, hdfs-7411.009.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-7686) Corrupt block reporting to namenode soon feature is overwritten by HDFS-7430
Rushabh S Shah created HDFS-7686: Summary: Corrupt block reporting to namenode soon feature is overwritten by HDFS-7430 Key: HDFS-7686 URL: https://issues.apache.org/jira/browse/HDFS-7686 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Rushabh S Shah The feature implemented in HDFS-7548 is removed by HDFS-7430. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7353) Raw Erasure Coder API for concrete encoding and decoding
[ https://issues.apache.org/jira/browse/HDFS-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14293893#comment-14293893 ] Zhe Zhang commented on HDFS-7353: - Weird eclipse error. Thanks [~clamb] for providing the following helpful advice offline. [~drankye] I guess you'll take care of it? bq. You could try either resubmitting it, or doing the command yourself to see what happens: /home/jenkins/tools/maven/latest/bin/mvn eclipse:eclipse -DHadoopPatchProcess > /home/jenkins/jenkins-slave/workspace/PreCommit-HDFS-Build/../patchprocess/patchEclipseOutput.txt 2>&1 bq. I think I've seen similar and it tends to be a jenkins error more than anything else. Raw Erasure Coder API for concrete encoding and decoding Key: HDFS-7353 URL: https://issues.apache.org/jira/browse/HDFS-7353 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Fix For: HDFS-EC Attachments: HDFS-7353-v1.patch, HDFS-7353-v2.patch, HDFS-7353-v3.patch, HDFS-7353-v4.patch, HDFS-7353-v5.patch, HDFS-7353-v6.patch, HDFS-7353-v7.patch This is to abstract and define a raw erasure coder API across different code algorithms such as RS and XOR. Such an API can be implemented by utilizing various library support, such as the Intel ISA library and the Jerasure library. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
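To make the abstraction concrete, here is what a raw coder for the simplest code (XOR) might look like under such an API. This is a hand-written illustration of the concept, not the interface defined in HDFS-7353-v7.patch:

```java
// Toy raw XOR coder: encode produces one parity chunk as the XOR of all data
// chunks; decode rebuilds a single erased chunk from the survivors plus parity.
public class XorRawCoder {
    static byte[] encode(byte[][] dataChunks) {
        byte[] parity = new byte[dataChunks[0].length];
        for (byte[] chunk : dataChunks) {
            for (int i = 0; i < parity.length; i++) {
                parity[i] ^= chunk[i];
            }
        }
        return parity;
    }

    static byte[] decode(byte[][] survivingChunks, byte[] parity) {
        byte[] recovered = parity.clone();
        for (byte[] chunk : survivingChunks) {
            for (int i = 0; i < recovered.length; i++) {
                recovered[i] ^= chunk[i];
            }
        }
        return recovered;
    }
}
```

An RS coder would slot into the same encode/decode shape, which is what allows implementations to back onto libraries like Intel ISA-L or Jerasure.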
[jira] [Updated] (HDFS-7611) deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-7611: Attachment: HDFS-7611.002.patch Thanks for the review, [~shv]! Update the patch to address your comments. deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. --- Key: HDFS-7611 URL: https://issues.apache.org/jira/browse/HDFS-7611 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Assignee: Byron Wong Priority: Critical Attachments: HDFS-7611.000.patch, HDFS-7611.001.patch, HDFS-7611.002.patch, blocksNotDeletedTest.patch, testTruncateEditLogLoad.log If quotas are enabled a combination of operations *deleteSnapshot* and *delete* of a file can leave orphaned blocks in the blocksMap on NameNode restart. They are counted as missing on the NameNode, and can prevent NameNode from coming out of safeMode and could cause memory leak during startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7686) Corrupt block reporting to namenode soon feature is overwritten by HDFS-7430
[ https://issues.apache.org/jira/browse/HDFS-7686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294117#comment-14294117 ] Rushabh S Shah commented on HDFS-7686: -- The feature implemented by HDFS-7548 was removed by HDFS-7430. Corrupt block reporting to namenode soon feature is overwritten by HDFS-7430 --- Key: HDFS-7686 URL: https://issues.apache.org/jira/browse/HDFS-7686 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Rushabh S Shah The feature implemented in HDFS-7548 is removed by HDFS-7430. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7683) Combine usages and percent stats in NameNode UI
[ https://issues.apache.org/jira/browse/HDFS-7683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293991#comment-14293991 ] Haohui Mai commented on HDFS-7683: -- +1. I'll commit it shortly. Combine usages and percent stats in NameNode UI --- Key: HDFS-7683 URL: https://issues.apache.org/jira/browse/HDFS-7683 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Minor Attachments: 7683-snapshot.jpg, HDFS-7683-001.patch, HDFS-7683-001.patch In NameNode UI, there are separate rows to display cluster usage, one is in bytes, another one is in percentage. We can combine these two rows to just display percent usage in brackets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5420) Not need to launch Secondary namenode for NN HA mode?
[ https://issues.apache.org/jira/browse/HDFS-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-5420: --- Attachment: HDFS-5420-00.patch -00: * Initial version I think this covers some cases, but not all of them. Better than nothing! Not need to launch Secondary namenode for NN HA mode? - Key: HDFS-5420 URL: https://issues.apache.org/jira/browse/HDFS-5420 Project: Hadoop HDFS Issue Type: Bug Components: scripts Affects Versions: 3.0.0 Reporter: Raymond Liu Priority: Minor Attachments: HDFS-5420-00.patch For Hadoop 2, when deploying with NN HA, the wiki says that it is an error to start a secondary namenode, yet sbin/start-dfs.sh still launches one even when nothing related to the secondary namenode is configured. Should this be fixed, or do people just not use these scripts to start an HA HDFS? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-5420) Not need to launch Secondary namenode for NN HA mode?
[ https://issues.apache.org/jira/browse/HDFS-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-5420: --- Affects Version/s: 3.0.0 Status: Patch Available (was: Open) Not need to launch Secondary namenode for NN HA mode? - Key: HDFS-5420 URL: https://issues.apache.org/jira/browse/HDFS-5420 Project: Hadoop HDFS Issue Type: Bug Components: scripts Affects Versions: 3.0.0 Reporter: Raymond Liu Priority: Minor Attachments: HDFS-5420-00.patch For Hadoop 2, when deploying with NN HA, the wiki says that it is an error to start a secondary namenode, yet sbin/start-dfs.sh still launches one even when nothing related to the secondary namenode is configured. Should this be fixed, or do people just not use these scripts to start an HA HDFS? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5796) The file system browser in the namenode UI requires SPNEGO.
[ https://issues.apache.org/jira/browse/HDFS-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294023#comment-14294023 ] Arun Suresh commented on HDFS-5796: --- [~wheat9], [~qwertymaniac], Thank you both for chiming in. bq. Is giving every file the world-readable permission a possible workaround? It looks like the workaround is fully equivalent to using dr.who here? The latest patch allows the administrator to configure a real user as the browser proxy (not the static dr.who). Since this would be a real, valid user, the HDFS admin can apply normal access grants/restrictions to this user, so files would not need to be world-readable as with dr.who. The file system browser in the namenode UI requires SPNEGO. --- Key: HDFS-5796 URL: https://issues.apache.org/jira/browse/HDFS-5796 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Arun Suresh Attachments: HDFS-5796.1.patch, HDFS-5796.1.patch, HDFS-5796.2.patch, HDFS-5796.3.patch, HDFS-5796.3.patch After HDFS-5382, the browser makes webhdfs REST calls directly, requiring SPNEGO to work between the user's browser and the namenode. This won't work if the cluster's security infrastructure is isolated from the regular network. Moreover, SPNEGO is not supposed to be required for user-facing web pages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
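For context, the static web user being contrasted here is set by the existing `hadoop.http.staticuser.user` property, whose default is dr.who. A site file entry looks like the fragment below; the patch under discussion goes further by proxying browsing through a configurable real user, and the value shown is only an example:

```xml
<!-- core-site.xml: the static user the NameNode web UI browses as.
     Default is dr.who; "webviewer" is an example value, not a recommendation. -->
<property>
  <name>hadoop.http.staticuser.user</name>
  <value>webviewer</value>
</property>
```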
[jira] [Commented] (HDFS-7611) deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14293184#comment-14293184 ] Konstantin Shvachko commented on HDFS-7611: --- I like option 2. Minor comments.
# {{updateQuota()}} can be moved to {{FSDirectory}}. That way you can also reuse it in {{unprotectedDelete()}}. And call it something like {{updateQuotaIfNeeded()}}
# _@return 0 otherwise_ is formally correct in {{removeLastINode()}}, but it actually now returns only -1, 0, or 1.
# No need to import {{Assert}}, just use {{assertEquals()}} directly as it is already imported. This will also shorten the last line to 80 symbols.
# It would be good to add a log message to {{MiniDFSCluster.waitClusterUp()}}. Right now it only throws IOException when it times out, but that is not reflected in the logs. Adding something like
{code}
if (++i > 10) {
-  throw new IOException("Timed out waiting for Mini HDFS Cluster to start");
+  String msg = "Timed out waiting for Mini HDFS Cluster to start";
+  LOG.error(msg);
+  throw new IOException(msg);
}
{code}
was helpful in debugging the problem. deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. --- Key: HDFS-7611 URL: https://issues.apache.org/jira/browse/HDFS-7611 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Assignee: Byron Wong Priority: Critical Attachments: HDFS-7611.000.patch, HDFS-7611.001.patch, blocksNotDeletedTest.patch, testTruncateEditLogLoad.log If quotas are enabled, a combination of the operations *deleteSnapshot* and *delete* of a file can leave orphaned blocks in the blocksMap on NameNode restart. They are counted as missing on the NameNode, can prevent the NameNode from coming out of safeMode, and could cause a memory leak during startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7353) Raw Erasure Coder API for concrete encoding and decoding
[ https://issues.apache.org/jira/browse/HDFS-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HDFS-7353: Attachment: HDFS-7353-v7.patch Updated the patch one more time to use the better name suggested above. Raw Erasure Coder API for concrete encoding and decoding Key: HDFS-7353 URL: https://issues.apache.org/jira/browse/HDFS-7353 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Fix For: HDFS-EC Attachments: HDFS-7353-v1.patch, HDFS-7353-v2.patch, HDFS-7353-v3.patch, HDFS-7353-v4.patch, HDFS-7353-v5.patch, HDFS-7353-v6.patch, HDFS-7353-v7.patch This is to abstract and define raw erasure coder API across different codes algorithms like RS, XOR and etc. Such API can be implemented by utilizing various library support, such as Intel ISA library and Jerasure library. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7683) Combine usages and percent stats in NameNode UI
[ https://issues.apache.org/jira/browse/HDFS-7683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-7683: Attachment: HDFS-7683-001.patch Attached simple patch Combine usages and percent stats in NameNode UI --- Key: HDFS-7683 URL: https://issues.apache.org/jira/browse/HDFS-7683 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Minor Attachments: HDFS-7683-001.patch In NameNode UI, there are separate rows to display cluster usage, one is in bytes, another one is in percentage. We can combine these two rows to just display percent usage in brackets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7353) Raw Erasure Coder API for concrete encoding and decoding
[ https://issues.apache.org/jira/browse/HDFS-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293252#comment-14293252 ] Hadoop QA commented on HDFS-7353: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694748/HDFS-7353-v7.patch against trunk revision 6f9fe76. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:red}-1 eclipse:eclipse{color}. The patch failed to build with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9342//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9342//console This message is automatically generated. Raw Erasure Coder API for concrete encoding and decoding Key: HDFS-7353 URL: https://issues.apache.org/jira/browse/HDFS-7353 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Fix For: HDFS-EC Attachments: HDFS-7353-v1.patch, HDFS-7353-v2.patch, HDFS-7353-v3.patch, HDFS-7353-v4.patch, HDFS-7353-v5.patch, HDFS-7353-v6.patch, HDFS-7353-v7.patch This is to abstract and define raw erasure coder API across different codes algorithms like RS, XOR and etc. 
Such API can be implemented by utilizing various library support, such as Intel ISA library and Jerasure library. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7677) DistributedFileSystem#truncate should resolve symlinks
[ https://issues.apache.org/jira/browse/HDFS-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293166#comment-14293166 ] Hadoop QA commented on HDFS-7677: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694736/HDFS-7677.002.patch against trunk revision 6f9fe76. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:red}-1 eclipse:eclipse{color}. The patch failed to build with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9341//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9341//console This message is automatically generated. DistributedFileSystem#truncate should resolve symlinks -- Key: HDFS-7677 URL: https://issues.apache.org/jira/browse/HDFS-7677 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-7677.001.patch, HDFS-7677.002.patch We should resolve the symlinks in DistributedFileSystem#truncate as we do for {{create}}, {{open}}, {{append}} and so on, I don't see any reason not support it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7630) TestConnCache hardcode block size without considering native OS
[ https://issues.apache.org/jira/browse/HDFS-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293183#comment-14293183 ] Hadoop QA commented on HDFS-7630: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694711/HDFS-7630.002.patch against trunk revision 6f9fe76. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.TestBlockScanner Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9338//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9338//console This message is automatically generated. TestConnCache hardcode block size without considering native OS --- Key: HDFS-7630 URL: https://issues.apache.org/jira/browse/HDFS-7630 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: sam liu Assignee: sam liu Attachments: HDFS-7630.001.patch, HDFS-7630.002.patch TestConnCache hardcode block size with 'BLOCK_SIZE = 4096', however it's incorrect on some platforms. For example, on power platform, the correct value is 65536. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7662) Erasure Coder API for encoding and decoding of block group
[ https://issues.apache.org/jira/browse/HDFS-7662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HDFS-7662: Summary: Erasure Coder API for encoding and decoding of block group (was: Erasure Coder API for encoding and decoding of BlockGroup) Erasure Coder API for encoding and decoding of block group -- Key: HDFS-7662 URL: https://issues.apache.org/jira/browse/HDFS-7662 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng This is to define ErasureCoder API for encoding and decoding of BlockGroup. Given a BlockGroup, ErasureCoder extracts data chunks from the blocks and leverages RawErasureCoder defined in HDFS-7353 to perform concrete encoding or decoding. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5796) The file system browser in the namenode UI requires SPNEGO.
[ https://issues.apache.org/jira/browse/HDFS-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293261#comment-14293261 ] Hadoop QA commented on HDFS-5796: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694722/HDFS-5796.3.patch against trunk revision 6f9fe76. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-auth hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9339//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9339//console This message is automatically generated. The file system browser in the namenode UI requires SPNEGO. --- Key: HDFS-5796 URL: https://issues.apache.org/jira/browse/HDFS-5796 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Arun Suresh Attachments: HDFS-5796.1.patch, HDFS-5796.1.patch, HDFS-5796.2.patch, HDFS-5796.3.patch, HDFS-5796.3.patch After HDFS-5382, the browser makes webhdfs REST calls directly, requiring SPNEGO to work between user's browser and namenode. This won't work if the cluster's security infrastructure is isolated from the regular network. 
Moreover, SPNEGO is not supposed to be required for user-facing web pages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7662) Erasure Coder API for encoding and decoding of block group
[ https://issues.apache.org/jira/browse/HDFS-7662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Zheng updated HDFS-7662: Attachment: HDFS-7662-v1.patch Uploaded a well-prepared patch for review. Erasure Coder API for encoding and decoding of block group -- Key: HDFS-7662 URL: https://issues.apache.org/jira/browse/HDFS-7662 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HDFS-7662-v1.patch This is to define ErasureCoder API for encoding and decoding of BlockGroup. Given a BlockGroup, ErasureCoder extracts data chunks from the blocks and leverages RawErasureCoder defined in HDFS-7353 to perform concrete encoding or decoding. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7675) Unused member DFSClient.spanReceiverHost
[ https://issues.apache.org/jira/browse/HDFS-7675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293195#comment-14293195 ] Konstantin Shvachko commented on HDFS-7675: --- Thanks for clarifying. I see it as a compile warning, which indicates something is wrong. If the reference is not used there is no reason to keep it in the class. In this case would it be logical to {{closeReceivers()}} when DFSClient is closing? Unused member DFSClient.spanReceiverHost Key: HDFS-7675 URL: https://issues.apache.org/jira/browse/HDFS-7675 Project: Hadoop HDFS Issue Type: Bug Components: dfsclient Affects Versions: 3.0.0 Reporter: Konstantin Shvachko DFSClient.spanReceiverHost is initialised but never used. Could be redundant. This was introduced by HDFS-7055. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7353) Raw Erasure Coder API for concrete encoding and decoding
[ https://issues.apache.org/jira/browse/HDFS-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293229#comment-14293229 ] Kai Zheng commented on HDFS-7353: - Hello [~zhz] and [~szetszwo], or anybody: the related patch for HDFS-7662 is also ready; would you review it? Thanks! Raw Erasure Coder API for concrete encoding and decoding Key: HDFS-7353 URL: https://issues.apache.org/jira/browse/HDFS-7353 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Fix For: HDFS-EC Attachments: HDFS-7353-v1.patch, HDFS-7353-v2.patch, HDFS-7353-v3.patch, HDFS-7353-v4.patch, HDFS-7353-v5.patch, HDFS-7353-v6.patch, HDFS-7353-v7.patch This is to abstract and define a raw erasure coder API across different coding algorithms such as RS and XOR. Such an API can be implemented by utilizing various library support, such as the Intel ISA library and the Jerasure library. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
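The raw coder abstraction discussed above spans algorithms such as RS and XOR. XOR is the simplest case and makes the encode/decode shape of such an API concrete. The class below is an illustrative sketch, not the API defined in the HDFS-7353 patches:

```java
// Illustrative sketch of a "raw" XOR erasure coder: the parity chunk is the
// byte-wise XOR of all data chunks, and any single lost chunk is recovered by
// XOR-ing the parity with the surviving chunks.
public class XorRawCoderSketch {
    // Encode: produce one parity chunk from equally sized data chunks.
    static byte[] encode(byte[][] data) {
        byte[] parity = new byte[data[0].length];
        for (byte[] chunk : data)
            for (int i = 0; i < chunk.length; i++)
                parity[i] ^= chunk[i];
        return parity;
    }

    // Decode: recover the single erased chunk from parity + surviving chunks.
    static byte[] decode(byte[][] surviving, byte[] parity) {
        byte[] lost = parity.clone();
        for (byte[] chunk : surviving)
            for (int i = 0; i < chunk.length; i++)
                lost[i] ^= chunk[i];
        return lost;
    }
}
```

RS generalizes this by working over a Galois field so multiple erasures can be tolerated; libraries such as the ones named above implement that arithmetic efficiently.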
[jira] [Created] (HDFS-7683) Combine usages and percent stats in NameNode UI
Vinayakumar B created HDFS-7683: --- Summary: Combine usages and percent stats in NameNode UI Key: HDFS-7683 URL: https://issues.apache.org/jira/browse/HDFS-7683 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Minor In NameNode UI, there are separate rows to display cluster usage, one is in bytes, another one is in percentage. We can combine these two rows to just display percent usage in brackets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7683) Combine usages and percent stats in NameNode UI
[ https://issues.apache.org/jira/browse/HDFS-7683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-7683: Attachment: HDFS-7683-001.patch last time, QA picked up snapshot. Attaching the patch again for QA Combine usages and percent stats in NameNode UI --- Key: HDFS-7683 URL: https://issues.apache.org/jira/browse/HDFS-7683 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Minor Attachments: 7683-snapshot.jpg, HDFS-7683-001.patch, HDFS-7683-001.patch In NameNode UI, there are separate rows to display cluster usage, one is in bytes, another one is in percentage. We can combine these two rows to just display percent usage in brackets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3689) Add support for variable length block
[ https://issues.apache.org/jira/browse/HDFS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293159#comment-14293159 ] Hadoop QA commented on HDFS-3689: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694704/HDFS-3689.010.patch against trunk revision 6f9fe76. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 14 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-nfs: org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9336//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9336//console This message is automatically generated. 
Add support for variable length block - Key: HDFS-3689 URL: https://issues.apache.org/jira/browse/HDFS-3689 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, hdfs-client, namenode Affects Versions: 3.0.0 Reporter: Suresh Srinivas Assignee: Jing Zhao Attachments: HDFS-3689.000.patch, HDFS-3689.001.patch, HDFS-3689.002.patch, HDFS-3689.003.patch, HDFS-3689.003.patch, HDFS-3689.004.patch, HDFS-3689.005.patch, HDFS-3689.006.patch, HDFS-3689.007.patch, HDFS-3689.008.patch, HDFS-3689.008.patch, HDFS-3689.009.patch, HDFS-3689.009.patch, HDFS-3689.010.patch, editsStored Currently HDFS supports fixed length blocks. Supporting variable length block will allow new use cases and features to be built on top of HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7683) Combine usages and percent stats in NameNode UI
[ https://issues.apache.org/jira/browse/HDFS-7683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-7683: Attachment: 7683-snapshot.jpg Attached the snapshot of before-after patch Combine usages and percent stats in NameNode UI --- Key: HDFS-7683 URL: https://issues.apache.org/jira/browse/HDFS-7683 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Minor Attachments: 7683-snapshot.jpg, HDFS-7683-001.patch In NameNode UI, there are separate rows to display cluster usage, one is in bytes, another one is in percentage. We can combine these two rows to just display percent usage in brackets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7683) Combine usages and percent stats in NameNode UI
[ https://issues.apache.org/jira/browse/HDFS-7683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-7683: Status: Patch Available (was: Open) Combine usages and percent stats in NameNode UI --- Key: HDFS-7683 URL: https://issues.apache.org/jira/browse/HDFS-7683 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Minor Attachments: HDFS-7683-001.patch In NameNode UI, there are separate rows to display cluster usage, one is in bytes, another one is in percentage. We can combine these two rows to just display percent usage in brackets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7683) Combine usages and percent stats in NameNode UI
[ https://issues.apache.org/jira/browse/HDFS-7683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293251#comment-14293251 ] Hadoop QA commented on HDFS-7683: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694754/7683-snapshot.jpg against trunk revision 6f9fe76. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9343//console This message is automatically generated. Combine usages and percent stats in NameNode UI --- Key: HDFS-7683 URL: https://issues.apache.org/jira/browse/HDFS-7683 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Vinayakumar B Assignee: Vinayakumar B Priority: Minor Attachments: 7683-snapshot.jpg, HDFS-7683-001.patch In NameNode UI, there are separate rows to display cluster usage, one is in bytes, another one is in percentage. We can combine these two rows to just display percent usage in brackets. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7684) The host:port settings of dfs.namenode.secondary.http-address should be trimmed before use
[ https://issues.apache.org/jira/browse/HDFS-7684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tianyin Xu updated HDFS-7684: - Summary: The host:port settings of dfs.namenode.secondary.http-address should be trimmed before use (was: The Host:Port Settings of dfs.namenode.secondary.http-address should be trimmed before use) The host:port settings of dfs.namenode.secondary.http-address should be trimmed before use -- Key: HDFS-7684 URL: https://issues.apache.org/jira/browse/HDFS-7684 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.4.1, 2.5.1 Reporter: Tianyin Xu With the following setting (note the trailing space in the value),

<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>myhostname:50090 </value>
</property>

the secondary NameNode could not be started:

$ hadoop-daemon.sh start secondarynamenode
starting secondarynamenode, logging to /home/hadoop/hadoop-2.4.1/logs/hadoop-hadoop-secondarynamenode-xxx.out
/home/hadoop/hadoop-2.4.1/bin/hdfs
Exception in thread "main" java.lang.IllegalArgumentException: Does not contain a valid host:port authority: myhostname:50090
        at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:196)
        at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:163)
        at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:152)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.getHttpAddress(SecondaryNameNode.java:203)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:214)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.init(SecondaryNameNode.java:192)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:651)

The log message misled us: we suspected DNS problems (changed to an IP address, with no success) and network problems (tried to test the connections, with no success). It turned out that the setting is not trimmed, and the trailing space character at the end of the value caused the problem. Searching on the Internet, we found we are really not alone; many users have encountered similar trim problems. The following lists a few: http://solaimurugan.blogspot.com/2013/10/hadoop-multi-node-cluster-configuration.html http://stackoverflow.com/questions/11263664/error-while-starting-the-hadoop-using-strat-all-sh https://issues.apache.org/jira/browse/HDFS-2799 https://issues.apache.org/jira/browse/HBASE-6973 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
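The fix the summary asks for is small: trim the raw configuration value before handing it to the host:port parser. A minimal plain-Java sketch of the idea (Hadoop's Configuration.getTrimmed exists for exactly this purpose; the helper class and method names here are hypothetical):

```java
// Sketch: a trailing space in "myhostname:50090 " makes host:port parsing
// fail, so trim the configured value before parsing it.
public class TrimmedAddr {
    static String hostPort(String raw) {
        String addr = raw.trim();   // strip stray whitespace from the XML value
        int colon = addr.lastIndexOf(':');
        if (colon < 1 || colon == addr.length() - 1)
            throw new IllegalArgumentException(
                "Does not contain a valid host:port authority: " + raw);
        Integer.parseInt(addr.substring(colon + 1)); // validate the port part
        return addr;
    }
}
```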
[jira] [Created] (HDFS-7684) The Host:Port Settings of dfs.namenode.secondary.http-address should be trimmed before use
Tianyin Xu created HDFS-7684: Summary: The Host:Port Settings of dfs.namenode.secondary.http-address should be trimmed before use Key: HDFS-7684 URL: https://issues.apache.org/jira/browse/HDFS-7684 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.1, 2.4.1 Reporter: Tianyin Xu With the following setting (note the trailing space in the value),

<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>myhostname:50090 </value>
</property>

the secondary NameNode could not be started:

$ hadoop-daemon.sh start secondarynamenode
starting secondarynamenode, logging to /home/hadoop/hadoop-2.4.1/logs/hadoop-hadoop-secondarynamenode-xxx.out
/home/hadoop/hadoop-2.4.1/bin/hdfs
Exception in thread "main" java.lang.IllegalArgumentException: Does not contain a valid host:port authority: myhostname:50090
        at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:196)
        at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:163)
        at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:152)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.getHttpAddress(SecondaryNameNode.java:203)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:214)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.init(SecondaryNameNode.java:192)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:651)

The log message misled us: we suspected DNS problems (changed to an IP address, with no success) and network problems (tried to test the connections, with no success). It turned out that the setting is not trimmed, and the trailing space character at the end of the value caused the problem. Searching on the Internet, we found we are really not alone; many users have encountered similar trim problems. The following lists a few: http://solaimurugan.blogspot.com/2013/10/hadoop-multi-node-cluster-configuration.html http://stackoverflow.com/questions/11263664/error-while-starting-the-hadoop-using-strat-all-sh https://issues.apache.org/jira/browse/HDFS-2799 https://issues.apache.org/jira/browse/HBASE-6973 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7353) Raw Erasure Coder API for concrete encoding and decoding
[ https://issues.apache.org/jira/browse/HDFS-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-7353: -- Issue Type: New Feature (was: Sub-task) Parent: (was: HDFS-7285) Raw Erasure Coder API for concrete encoding and decoding Key: HDFS-7353 URL: https://issues.apache.org/jira/browse/HDFS-7353 Project: Hadoop HDFS Issue Type: New Feature Reporter: Kai Zheng Assignee: Kai Zheng Fix For: HDFS-EC Attachments: HDFS-7353-v1.patch, HDFS-7353-v2.patch, HDFS-7353-v3.patch, HDFS-7353-v4.patch, HDFS-7353-v5.patch, HDFS-7353-v6.patch, HDFS-7353-v7.patch This is to abstract and define raw erasure coder API across different codes algorithms like RS, XOR and etc. Such API can be implemented by utilizing various library support, such as Intel ISA library and Jerasure library. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7630) TestConnCache hardcode block size without considering native OS
[ https://issues.apache.org/jira/browse/HDFS-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294212#comment-14294212 ] Arpit Agarwal commented on HDFS-7630: - Thanks [~sam liu]. The modified tests pass on Windows, Linux and OS X, so I'd be fine with committing it. However, I'm not sure why BLOCK_SIZE needs to be the same as the OS page size. Could you please share some more details about the failure? For example, on Linux TestConnCache passes even if BLOCK_SIZE is set to 2048. TestConnCache hardcode block size without considering native OS --- Key: HDFS-7630 URL: https://issues.apache.org/jira/browse/HDFS-7630 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: sam liu Assignee: sam liu Attachments: HDFS-7630.001.patch, HDFS-7630.002.patch TestConnCache hardcodes the block size with 'BLOCK_SIZE = 4096'; however, that value is incorrect on some platforms. For example, on the Power platform the correct value is 65536. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294224#comment-14294224 ] Tsz Wo Nicholas Sze commented on HDFS-7411: --- dfs.namenode.decommission.nodes.per.interval is a public conf property, so it cannot simply be replaced. We should deprecate it first. Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch, hdfs-7411.009.patch It would be nice to split decommission logic out of DatanodeManager into DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
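Deprecating a public conf property, as suggested above, usually means the old key keeps working (with a warning) while the new key takes precedence. Hadoop's own mechanism is Configuration.addDeprecation; the plain-Java fallback below only sketches that behavior, and the new key name is a hypothetical stand-in:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: read a setting by its new key, falling back to the deprecated key
// with a warning, so existing configs keep working during the transition.
public class DeprecatedKeyLookup {
    static final String OLD_KEY = "dfs.namenode.decommission.nodes.per.interval";
    static final String NEW_KEY = "dfs.namenode.decommission.blocks.per.interval"; // hypothetical

    static String get(Map<String, String> conf) {
        if (conf.containsKey(NEW_KEY)) return conf.get(NEW_KEY);
        if (conf.containsKey(OLD_KEY)) {
            System.err.println(OLD_KEY + " is deprecated; use " + NEW_KEY);
            return conf.get(OLD_KEY);
        }
        return null; // caller applies the default
    }
}
```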
[jira] [Updated] (HDFS-7626) TestPipelinesFailover hardcode block size without considering native OS
[ https://issues.apache.org/jira/browse/HDFS-7626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7626: Resolution: Duplicate Status: Resolved (was: Patch Available) Resolving as dup of HDFS-7630. TestPipelinesFailover hardcode block size without considering native OS --- Key: HDFS-7626 URL: https://issues.apache.org/jira/browse/HDFS-7626 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: sam liu Assignee: sam liu Attachments: HDFS-7626.001.patch TestPipelinesFailover hardcode block size with 'BLOCK_SIZE = 4096', however it's incorrect on some platforms. For example, on power platform, the correct value is 65536. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7624) TestFileAppendRestart hardcode block size without considering native OS
[ https://issues.apache.org/jira/browse/HDFS-7624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7624: Resolution: Duplicate Status: Resolved (was: Patch Available) Resolving as dup of HDFS-7630. TestFileAppendRestart hardcode block size without considering native OS --- Key: HDFS-7624 URL: https://issues.apache.org/jira/browse/HDFS-7624 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: sam liu Assignee: sam liu Attachments: HDFS-7624.001.patch, HDFS-7624.002.patch TestFileAppendRestart hardcode block size with 'BLOCK_SIZE = 4096', however it's incorrect on some platforms. For example, on power platform, the correct value is 65536. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7629) TestDisableConnCache hardcode block size without considering native OS
[ https://issues.apache.org/jira/browse/HDFS-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7629: Resolution: Duplicate Status: Resolved (was: Patch Available) Resolving as dup of HDFS-7630. TestDisableConnCache hardcode block size without considering native OS -- Key: HDFS-7629 URL: https://issues.apache.org/jira/browse/HDFS-7629 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: sam liu Assignee: sam liu Attachments: HDFS-7629.001.patch TestDisableConnCache hardcode block size with 'BLOCK_SIZE = 4096', however it's incorrect on some platforms. For example, on power platform, the correct value is 65536. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7628) TestNameEditsConfigs hardcode block size without considering native OS
[ https://issues.apache.org/jira/browse/HDFS-7628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7628: Resolution: Duplicate Status: Resolved (was: Patch Available) Resolving as dup of HDFS-7630. TestNameEditsConfigs hardcode block size without considering native OS -- Key: HDFS-7628 URL: https://issues.apache.org/jira/browse/HDFS-7628 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: sam liu Assignee: sam liu Attachments: HDFS-7628.001.patch TestNameEditsConfigs hardcode block size with 'BLOCK_SIZE = 4096', however it's incorrect on some platforms. For example, on power platform, the correct value is 65536. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7627) TestCacheDirectives hardcode block size without considering native OS
[ https://issues.apache.org/jira/browse/HDFS-7627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7627: Resolution: Duplicate Status: Resolved (was: Patch Available) Resolving as dup of HDFS-7630. TestCacheDirectives hardcode block size without considering native OS - Key: HDFS-7627 URL: https://issues.apache.org/jira/browse/HDFS-7627 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: sam liu Assignee: sam liu Attachments: HDFS-7627.001.patch TestCacheDirectives hardcode block size with 'BLOCK_SIZE = 4096', however it's incorrect on some platforms. For example, on power platform, the correct value is 65536. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7682) {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content
[ https://issues.apache.org/jira/browse/HDFS-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294146#comment-14294146 ] Hadoop QA commented on HDFS-7682: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694806/HDFS-7682.001.patch against trunk revision f56da3c. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9347//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9347//console This message is automatically generated. {{DistributedFileSystem#getFileChecksum}} of a snapshotted file includes non-snapshotted content Key: HDFS-7682 URL: https://issues.apache.org/jira/browse/HDFS-7682 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Charles Lamb Assignee: Charles Lamb Attachments: HDFS-7682.000.patch, HDFS-7682.001.patch DistributedFileSystem#getFileChecksum of a snapshotted file includes non-snapshotted content. 
The reason why this happens is because DistributedFileSystem#getFileChecksum simply calculates the checksum of all of the CRCs from the blocks in the file. But, in the case of a snapshotted file, we don't want to include data in the checksum that was appended to the last block in the file after the snapshot was taken. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
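The comment above implies the fix: when checksumming a file through a snapshot path, only the bytes up to the file's length at snapshot time should contribute. The sketch below shows only that length-capping idea with a plain CRC32 over a byte prefix; the real getFileChecksum combines per-block CRCs (MD5-of-MD5s), and the class and method names here are hypothetical:

```java
import java.util.zip.CRC32;

// Sketch: checksum only the first 'snapshotLength' bytes, so data appended
// to the last block after the snapshot does not change the snapshot file's
// checksum.
public class SnapshotChecksumSketch {
    static long checksumUpTo(byte[] fileBytes, int snapshotLength) {
        CRC32 crc = new CRC32();
        crc.update(fileBytes, 0, Math.min(snapshotLength, fileBytes.length));
        return crc.getValue();
    }
}
```

With this capping, appending bytes beyond snapshotLength leaves the result unchanged, which is the behavior HDFS-7682 wants for snapshotted files.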
[jira] [Updated] (HDFS-3689) Add support for variable length block
[ https://issues.apache.org/jira/browse/HDFS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-3689: Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Thanks for the review, Nicholas! I've committed this to trunk. And thanks to all for the discussion. Add support for variable length block - Key: HDFS-3689 URL: https://issues.apache.org/jira/browse/HDFS-3689 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, hdfs-client, namenode Affects Versions: 3.0.0 Reporter: Suresh Srinivas Assignee: Jing Zhao Fix For: 3.0.0 Attachments: HDFS-3689.000.patch, HDFS-3689.001.patch, HDFS-3689.002.patch, HDFS-3689.003.patch, HDFS-3689.003.patch, HDFS-3689.004.patch, HDFS-3689.005.patch, HDFS-3689.006.patch, HDFS-3689.007.patch, HDFS-3689.008.patch, HDFS-3689.008.patch, HDFS-3689.009.patch, HDFS-3689.009.patch, HDFS-3689.010.patch, editsStored Currently HDFS supports fixed length blocks. Supporting variable length block will allow new use cases and features to be built on top of HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7675) Unused member DFSClient.spanReceiverHost
[ https://issues.apache.org/jira/browse/HDFS-7675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294231#comment-14294231 ] Konstantin Shvachko commented on HDFS-7675: --- Sounds like you don't need the reference to {{SpanReceiverHost}} in {{DFSClient}}, but you still need to initialize {{SpanReceiverHost#SingletonHolder#INSTANCE}}. Maybe a static method that initializes the INSTANCE directly can be introduced, rather than using {{getInstance()}} for {{DFSClient}}? Unused member DFSClient.spanReceiverHost Key: HDFS-7675 URL: https://issues.apache.org/jira/browse/HDFS-7675 Project: Hadoop HDFS Issue Type: Bug Components: dfsclient Affects Versions: 3.0.0 Reporter: Konstantin Shvachko DFSClient.spanReceiverHost is initialised but never used. It could be redundant. This was introduced by HDFS-7055. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
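The suggestion above — keeping the singleton initialized without DFSClient retaining an unused field — is the standard initialization-on-demand holder idiom plus a static init method. The class below mirrors the names in the discussion but is an illustrative sketch, not the HDFS-7055 implementation:

```java
// Sketch of the holder idiom: the singleton is created lazily on first use,
// and init() forces creation without the caller keeping a reference.
public class SpanReceiverHostSketch {
    private static class SingletonHolder {
        static final SpanReceiverHostSketch INSTANCE = new SpanReceiverHostSketch();
    }

    static int instancesCreated = 0; // for demonstration only

    private SpanReceiverHostSketch() { instancesCreated++; }

    public static SpanReceiverHostSketch getInstance() {
        return SingletonHolder.INSTANCE;
    }

    // What the comment proposes: initialize the INSTANCE directly, so
    // DFSClient need not hold (and never read) a spanReceiverHost field.
    public static void init() {
        getInstance();
    }
}
```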
[jira] [Updated] (HDFS-6729) Support maintenance mode for DN
[ https://issues.apache.org/jira/browse/HDFS-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-6729: Attachment: HDFS-6729.005.patch Updated the patch to fix the failed test. Hey, [~andrew.wang], thanks for your quick review :). Yes, this {{maintenance mode}} is a soft state, and NN restarts / failovers are relatively rare events. Even in NN restart / failover scenarios, the NN can treat this DN as stale / dead, which does not sacrifice durability / availability. Support maintenance mode for DN --- Key: HDFS-6729 URL: https://issues.apache.org/jira/browse/HDFS-6729 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6729.000.patch, HDFS-6729.001.patch, HDFS-6729.002.patch, HDFS-6729.003.patch, HDFS-6729.004.patch, HDFS-6729.005.patch Some maintenance work on a DataNode (e.g., upgrading RAM or adding disks) takes only a short amount of time (e.g., 10 minutes). In these cases, the users do not want missing blocks reported for this DN, because the DN will be back online shortly without data loss. Thus, we need a maintenance mode for a DN so that maintenance work can be carried out on the DN without having to decommission it or the DN being marked as dead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7018) Implement C interface for libhdfs3
[ https://issues.apache.org/jira/browse/HDFS-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294266#comment-14294266 ] Colin Patrick McCabe commented on HDFS-7018: First off, I apologize for being slow to review this. It was because of the holidays... Re: strdup versus new char\[\]. If you really want to use {{new char[]}} to allocate C-style strings, I guess that's OK. But please do not call the function for doing this {{Strdup}}. People who have been using the real {{strdup}} for many years have a lot of expectations from this function: that it creates a string which is freed with {{free}} (not {{delete}}), that it does not accept {{NULL}} input, and so forth. Instead, call your new function something like {{AllocCharArray}} or something to make it clear what it does. bq. 2) remove PARAMETER_ASSERT Thanks, this is much more readable. bq. comments about exceptions The Google C\+\+ coding style is pretty clear. We do not use C++ exceptions. It doesn't say We do not use C++ exceptions in APIs... it just says we don't use them, period. So we should start eliminating them from libhdfs3 over time, hopefully before the merge happens. There are a bunch of rationales given in the coding style guide and I won't repeat them all here, but just to mention a few: * exceptions make control flow harder to follow * when you add a throw statement to an existing function, you must examine all of its transitive callers to see if they handle it correctly It's not just about compatibility with C code. That is only one reason. {code} +/** * Determine if a file is open for read. * * @param file The HDFS file @@ -332,7 +341,7 @@ extern C { * @return Returns the handle to the open file or NULL on error. */ hdfsFile hdfsOpenFile(hdfsFS fs, const char* path, int flags, - int bufferSize, short replication, tSize blocksize); + int bufferSize, short replication, tOffset blocksize); {code} Do we need to change the type of {{blocksize}} here? 
That is going to potentially impact {{libhdfs}} and {{libwebhdfs}}.

{code}
 /**
+ * Return error information of last failed operation.
+ *
+ * @return A not NULL const string pointer of last error information.
+ *         If last operation finished successfully,
+ *         the returned message is undefined.
+ */
+const char* hdfsGetLastError();
{code}

Can we add a comment like "Additional information about the last error encountered in this thread"? We should make it clear that this is thread-local information. Also, I would say "Successful operations do not clear this message" rather than "if last operation finished successfully, the returned message is undefined."

+1 once those are addressed.

Implement C interface for libhdfs3 -- Key: HDFS-7018 URL: https://issues.apache.org/jira/browse/HDFS-7018 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Zhanwei Wang Assignee: Zhanwei Wang Attachments: HDFS-7018-pnative.002.patch, HDFS-7018-pnative.003.patch, HDFS-7018-pnative.004.patch, HDFS-7018.patch Implement C interface for libhdfs3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
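The thread-local semantics requested for {{hdfsGetLastError}} can be illustrated in Java. This is a hedged sketch of the concept only (the names {{setLastError}}/{{getLastError}} and the message text are illustrative, not part of libhdfs3): each thread keeps its own last-error message, and successful operations do not clear it.

```java
public class LastErrorDemo {
    // One message slot per thread; successful operations never clear it.
    private static final ThreadLocal<String> lastError =
            ThreadLocal.withInitial(() -> "");

    static void setLastError(String msg) {
        lastError.set(msg);
    }

    static String getLastError() {
        return lastError.get();
    }

    public static void main(String[] args) throws InterruptedException {
        setLastError("open failed: /tmp/a");
        // A subsequent successful operation leaves the message untouched.
        String afterSuccess = getLastError();

        // Another thread sees its own (empty) message, not ours.
        final String[] other = new String[1];
        Thread t = new Thread(() -> other[0] = getLastError());
        t.start();
        t.join();

        System.out.println(afterSuccess);
        System.out.println(other[0].isEmpty());
    }
}
```

The second thread observing an empty message is exactly why the doc comment should call out the thread-local behavior.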
[jira] [Updated] (HDFS-7353) Raw Erasure Coder API for concrete encoding and decoding
[ https://issues.apache.org/jira/browse/HDFS-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-7353: -- Hadoop Flags: Reviewed +1 the latest patch looks good. Will move this JIRA to be a subtask of HADOOP-11264 since the change is in common. Raw Erasure Coder API for concrete encoding and decoding Key: HDFS-7353 URL: https://issues.apache.org/jira/browse/HDFS-7353 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Fix For: HDFS-EC Attachments: HDFS-7353-v1.patch, HDFS-7353-v2.patch, HDFS-7353-v3.patch, HDFS-7353-v4.patch, HDFS-7353-v5.patch, HDFS-7353-v6.patch, HDFS-7353-v7.patch This is to abstract and define a raw erasure coder API across different coding algorithms such as RS, XOR, etc. Such an API can be implemented by utilizing various library support, such as the Intel ISA library and the Jerasure library. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
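To make the "raw" in raw erasure coder concrete, here is a hedged sketch of what such an API might look like: it operates purely on byte-array chunks, independent of files or blocks. The interface and the XOR implementation below are illustrative stand-ins, not the committed HDFS-7353 API.

```java
import java.util.Arrays;

public class RawCoderSketch {
    interface RawErasureEncoder {
        // inputs: data chunks; outputs: parity chunks to be filled in.
        void encode(byte[][] inputs, byte[][] outputs);
    }

    // XOR coding: the single parity chunk is the XOR of all data chunks.
    static class XorRawEncoder implements RawErasureEncoder {
        public void encode(byte[][] inputs, byte[][] outputs) {
            byte[] parity = outputs[0];
            Arrays.fill(parity, (byte) 0);
            for (byte[] chunk : inputs) {
                for (int i = 0; i < parity.length; i++) {
                    parity[i] ^= chunk[i];
                }
            }
        }
    }

    public static void main(String[] args) {
        byte[][] data = { {1, 2}, {3, 4}, {5, 6} };
        byte[][] parity = { new byte[2] };
        new XorRawEncoder().encode(data, parity);
        // 1^3^5 = 7 and 2^4^6 = 0
        System.out.println(Arrays.toString(parity[0]));
    }
}
```

An RS implementation would plug into the same chunk-level interface, which is what lets libraries like Intel ISA-L or Jerasure back it.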
[jira] [Commented] (HDFS-7681) Fix ReplicaInputStream constructor to take InputStreams
[ https://issues.apache.org/jira/browse/HDFS-7681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294162#comment-14294162 ] Hadoop QA commented on HDFS-7681: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694807/HDFS-7681.patch against trunk revision f56da3c. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.balancer.TestBalancer Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9346//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9346//console This message is automatically generated. Fix ReplicaInputStream constructor to take InputStreams --- Key: HDFS-7681 URL: https://issues.apache.org/jira/browse/HDFS-7681 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: 3.0.0 Reporter: Joe Pallas Assignee: Joe Pallas Attachments: HDFS-7681.patch As noted in HDFS-5194, the constructor for {{ReplicaInputStream}} takes {{FileDescriptor}} s that are immediately turned into {{InputStream}} s, while the callers already have {{FileInputStream}} s from which they extract {{FileDescriptor}} s. 
This seems to have been done as part of a large set of changes to appease findbugs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
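The constructor change under discussion can be sketched as follows. This is a simplified, hypothetical stand-in for the real ReplicaInputStreams class: instead of taking {{FileDescriptor}}s and wrapping them internally, the streams are passed in directly, so callers that already hold {{FileInputStream}}s need not unwrap them.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReplicaStreamsSketch {
    static class ReplicaInputStreams {
        private final InputStream dataIn;
        private final InputStream checksumIn;

        // Before: ReplicaInputStreams(FileDescriptor dataFd, FileDescriptor checksumFd)
        // After: callers hand over the InputStreams they already have.
        ReplicaInputStreams(InputStream dataIn, InputStream checksumIn) {
            this.dataIn = dataIn;
            this.checksumIn = checksumIn;
        }

        int readDataByte() throws IOException {
            return dataIn.read();
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream data = new ByteArrayInputStream(new byte[] {42});
        InputStream sums = new ByteArrayInputStream(new byte[0]);
        ReplicaInputStreams streams = new ReplicaInputStreams(data, sums);
        System.out.println(streams.readDataByte());
    }
}
```

Accepting the more general {{InputStream}} type also helps alternate FsDatasetSpi implementations that have no file descriptors at all.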
[jira] [Updated] (HDFS-7625) TestPersistBlocks hardcode block size without considering native OS
[ https://issues.apache.org/jira/browse/HDFS-7625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-7625: Resolution: Duplicate Status: Resolved (was: Patch Available) Resolving as dup of HDFS-7630. TestPersistBlocks hardcode block size without considering native OS --- Key: HDFS-7625 URL: https://issues.apache.org/jira/browse/HDFS-7625 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: sam liu Assignee: sam liu Attachments: HDFS-7625.001.patch TestPersistBlocks hardcodes the block size with 'BLOCK_SIZE = 4096'; however, this value is incorrect on some platforms. For example, on the POWER platform, the correct value is 65536. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294241#comment-14294241 ] Tsz Wo Nicholas Sze commented on HDFS-7411: --- - Why add a new class GenericHdfsTestUtils? It only has one very specific method, setNameNodeLogLevel. Why not put it in DFSTestUtil? Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch, hdfs-7411.009.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294150#comment-14294150 ] Tsz Wo Nicholas Sze commented on HDFS-7285: --- I am fine with using erasurecode, although I prefer erasure_code. Erasure Coding Support inside HDFS -- Key: HDFS-7285 URL: https://issues.apache.org/jira/browse/HDFS-7285 Project: Hadoop HDFS Issue Type: New Feature Reporter: Weihua Jiang Assignee: Zhe Zhang Attachments: ECAnalyzer.py, ECParser.py, HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, fsimage-analysis-20150105.pdf Erasure Coding (EC) can greatly reduce storage overhead without sacrificing data reliability, compared to the existing HDFS 3-replica approach. For example, if we use 10+4 Reed-Solomon coding, we can tolerate the loss of 4 blocks, with a storage overhead of only 40%. This makes EC a quite attractive alternative for big data storage, particularly for cold data. Facebook had a related open source project called HDFS-RAID. It used to be one of the contributed packages in HDFS but was removed after Hadoop 2.0 for maintenance reasons. Its drawbacks are: 1) it sits on top of HDFS and depends on MapReduce to do encoding and decoding tasks; 2) it can only be used for cold files that are not intended to be appended anymore; 3) the pure Java EC coding implementation is extremely slow in practical use. Due to these, it might not be a good idea to just bring HDFS-RAID back. We (Intel and Cloudera) are working on a design that builds EC into HDFS, gets rid of any external dependencies, and makes it self-contained and independently maintained. This design lays the EC feature on top of the storage type support and aims to be compatible with existing HDFS features like caching, snapshots, encryption, and high availability. The design will also support different EC coding schemes, implementations, and policies for different deployment scenarios. By utilizing advanced libraries (e.g. the Intel ISA-L library), an implementation can greatly improve the performance of EC encoding/decoding and make the EC solution even more attractive. We will post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
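The overhead figures quoted in the description follow from simple arithmetic: 3-way replication stores 2 redundant copies per data copy (200% overhead), while a (10, 4) Reed-Solomon layout stores 4 parity blocks per 10 data blocks (40% overhead) yet still tolerates any 4 lost blocks. A quick check:

```java
public class OverheadDemo {
    // Redundant storage as a percentage of the data actually stored.
    static double overheadPercent(int dataUnits, int redundantUnits) {
        return 100.0 * redundantUnits / dataUnits;
    }

    public static void main(String[] args) {
        // 3 replicas = 1 data copy + 2 redundant copies.
        System.out.println(overheadPercent(1, 2));
        // RS(10, 4) = 10 data blocks + 4 parity blocks.
        System.out.println(overheadPercent(10, 4));
    }
}
```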
[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294185#comment-14294185 ] Tsz Wo Nicholas Sze commented on HDFS-7411: --- {quote}If you look at version 2 of the patch, you can see the initial refactor, which consisted of moving some methods from BlockManager to DecomManager. I didn't bother splitting this though since it ended up not being very interesting. DecomManager is also basically all new code, so the old code would be moved and then subsequently deleted if we split it.{quote} It seems not true that DecommissionManager is all new code. Quite a few methods, such as logBlockReplicationInfo(..), startDecommission(..) and stopDecommission(..), are moved from the existing code. This is what I mean: combining refactoring and improvement makes reviewing the patch harder and discourages collaboration. The patch here is unnecessarily big. Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch, hdfs-7411.009.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3689) Add support for variable length block
[ https://issues.apache.org/jira/browse/HDFS-3689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294190#comment-14294190 ] Hudson commented on HDFS-3689: -- SUCCESS: Integrated in Hadoop-trunk-Commit #6940 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6940/]) HDFS-3689. Add support for variable length block. Contributed by Jing Zhao. (jing9: rev 2848db814a98b83e7546f65a2751e56fb5b2dbe0) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInotifyEventInputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend2.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestHDFSConcat.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FSOutputSummer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppendRestart.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/CreateFlag.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOpCodes.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestLeaseRecovery.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java * hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/web/webhdfs/WebHdfsHandler.java * hadoop-hdfs-project/hadoop-hdfs/src/main/proto/inotify.proto * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java * 
hadoop-hdfs-project/hadoop-hdfs/src/main/proto/ClientNamenodeProtocol.proto * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/inotify/Event.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/client/HdfsDataOutputStream.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestLazyPersistFiles.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestRetryCacheWithHA.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/AppendTestUtil.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNamenodeRetryCache.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirConcatOp.java * hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/WriteManager.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeLayoutVersion.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolServerSideTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolTranslatorPB.java * hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java * 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend3.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/InotifyFSEditLogOpTranslator.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHFlush.java Add support for variable length block - Key: HDFS-3689 URL: https://issues.apache.org/jira/browse/HDFS-3689 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, hdfs-client, namenode Affects Versions: 3.0.0
[jira] [Commented] (HDFS-7675) Unused member DFSClient.spanReceiverHost
[ https://issues.apache.org/jira/browse/HDFS-7675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294269#comment-14294269 ] Colin Patrick McCabe commented on HDFS-7675: I don't see a good reason to use a static method here. That would potentially create race conditions where we'd try to initialize the {{SpanReceiverHost}} before some other stuff was ready. If you really want to clean up the compiler error, just keep the function call but don't store the result. Unused member DFSClient.spanReceiverHost Key: HDFS-7675 URL: https://issues.apache.org/jira/browse/HDFS-7675 Project: Hadoop HDFS Issue Type: Bug Components: dfsclient Affects Versions: 3.0.0 Reporter: Konstantin Shvachko DFSClient.spanReceiverHost is initialised but never used. Could be redundant. This was introduced by HDFS-7055. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
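The fix Colin suggests, keeping the side-effecting initialization call but not storing the result, can be sketched as follows. {{SpanReceiverHost}} here is a stub standing in for the real tracing class, and the before/after class names are hypothetical.

```java
public class UnusedFieldSketch {
    static class SpanReceiverHost {
        static int instances = 0;
        static SpanReceiverHost getInstance() {
            instances++;           // stand-in for real initialization side effects
            return new SpanReceiverHost();
        }
    }

    static class DfsClientBefore {
        // An unused field like this triggers the compiler warning.
        private final SpanReceiverHost spanReceiverHost = SpanReceiverHost.getInstance();
    }

    static class DfsClientAfter {
        DfsClientAfter() {
            // Side effect preserved, no dead field left behind.
            SpanReceiverHost.getInstance();
        }
    }

    public static void main(String[] args) {
        new DfsClientAfter();
        System.out.println(SpanReceiverHost.instances);
    }
}
```

The constructor still initializes the receiver host exactly once; only the unused reference is gone, avoiding the race conditions a static initializer could introduce.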
[jira] [Commented] (HDFS-5194) Robust support for alternate FsDatasetSpi implementations
[ https://issues.apache.org/jira/browse/HDFS-5194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294204#comment-14294204 ] Joe Pallas commented on HDFS-5194: -- Some of the issues identified in the document relating to scanners ( {{BlockPoolSliceScanner}} and {{DirectoryScanner}} ) are addressed by HDFS-7430. Robust support for alternate FsDatasetSpi implementations - Key: HDFS-5194 URL: https://issues.apache.org/jira/browse/HDFS-5194 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, hdfs-client Reporter: David Powell Assignee: David Powell Priority: Minor Attachments: HDFS-5194.design.01222014.pdf, HDFS-5194.design.09112013.pdf, HDFS-5194.patch.09112013 The existing FsDatasetSpi interface is well-positioned to permit extending Hadoop to run natively on non-traditional storage architectures. Before this can be done, however, a number of gaps need to be addressed. This JIRA documents those gaps, suggests some solutions, and puts forth a sample implementation of some of the key changes needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS
[ https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294282#comment-14294282 ] Kai Zheng commented on HDFS-7285: - Thanks for your confirmation. I will use erasurecode, in the same style as the names I found in the codebase: datatransfer, blockmanagement, and many others. Erasure Coding Support inside HDFS -- Key: HDFS-7285 URL: https://issues.apache.org/jira/browse/HDFS-7285 Project: Hadoop HDFS Issue Type: New Feature Reporter: Weihua Jiang Assignee: Zhe Zhang Attachments: ECAnalyzer.py, ECParser.py, HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, fsimage-analysis-20150105.pdf Erasure Coding (EC) can greatly reduce storage overhead without sacrificing data reliability, compared to the existing HDFS 3-replica approach. For example, if we use 10+4 Reed-Solomon coding, we can tolerate the loss of 4 blocks, with a storage overhead of only 40%. This makes EC a quite attractive alternative for big data storage, particularly for cold data. Facebook had a related open source project called HDFS-RAID. It used to be one of the contributed packages in HDFS but was removed after Hadoop 2.0 for maintenance reasons. Its drawbacks are: 1) it sits on top of HDFS and depends on MapReduce to do encoding and decoding tasks; 2) it can only be used for cold files that are not intended to be appended anymore; 3) the pure Java EC coding implementation is extremely slow in practical use. Due to these, it might not be a good idea to just bring HDFS-RAID back. We (Intel and Cloudera) are working on a design that builds EC into HDFS, gets rid of any external dependencies, and makes it self-contained and independently maintained. This design lays the EC feature on top of the storage type support and aims to be compatible with existing HDFS features like caching, snapshots, encryption, and high availability. The design will also support different EC coding schemes, implementations, and policies for different deployment scenarios. By utilizing advanced libraries (e.g. the Intel ISA-L library), an implementation can greatly improve the performance of EC encoding/decoding and make the EC solution even more attractive. We will post the design document soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7611) deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294294#comment-14294294 ] Hadoop QA commented on HDFS-7611: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694816/HDFS-7611.002.patch against trunk revision f56da3c. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.balancer.TestBalancer org.apache.hadoop.hdfs.server.namenode.ha.TestHAAppend Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9348//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9348//console This message is automatically generated. deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. 
--- Key: HDFS-7611 URL: https://issues.apache.org/jira/browse/HDFS-7611 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Assignee: Byron Wong Priority: Critical Attachments: HDFS-7611.000.patch, HDFS-7611.001.patch, HDFS-7611.002.patch, blocksNotDeletedTest.patch, testTruncateEditLogLoad.log If quotas are enabled, a combination of the operations *deleteSnapshot* and *delete* of a file can leave orphaned blocks in the blocksMap on NameNode restart. They are counted as missing on the NameNode, can prevent the NameNode from coming out of safe mode, and could cause a memory leak during startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7662) Erasure Coder API for encoding and decoding of block group
[ https://issues.apache.org/jira/browse/HDFS-7662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294423#comment-14294423 ] Zhe Zhang commented on HDFS-7662: - Thanks [~drankye]. Some high-level comments: # {{coder}} and {{rawcoder}} packages should be better named and placed. #* I think the difference is that {{rawcoder}} works on packet or chunk units, while {{coder}} works with HDFS blocks and block groups. If that's the case, is it better to keep {{rawcoder}} under Hadoop common and move {{coder}} to HDFS? #* Maybe change {{coder}} to {{blockcoder}}? # The {{ECBlock}} class has no pointer to identify an HDFS block. How should the client / DN specify which HDFS blocks to encode / decode? # Could you post an example of how to use the callback class under this JIRA? Erasure Coder API for encoding and decoding of block group -- Key: HDFS-7662 URL: https://issues.apache.org/jira/browse/HDFS-7662 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Kai Zheng Attachments: HDFS-7662-v1.patch This is to define an ErasureCoder API for encoding and decoding of a BlockGroup. Given a BlockGroup, the ErasureCoder extracts data chunks from the blocks and leverages the RawErasureCoder defined in HDFS-7353 to perform the concrete encoding or decoding. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
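Zhe's second point, that {{ECBlock}} needs some handle on block identity, could look roughly like the sketch below. The field names (blockPoolId, blockId) mirror HDFS's ExtendedBlock naming but are assumptions for illustration, not the HDFS-7662 API.

```java
public class EcBlockSketch {
    static class ECBlock {
        final String blockPoolId;  // which block pool the block belongs to
        final long blockId;        // identity the client / DN can act on
        final boolean isParity;    // data block vs. parity block in the group

        ECBlock(String blockPoolId, long blockId, boolean isParity) {
            this.blockPoolId = blockPoolId;
            this.blockId = blockId;
            this.isParity = isParity;
        }
    }

    public static void main(String[] args) {
        ECBlock b = new ECBlock("BP-1", 1073741825L, false);
        System.out.println(b.blockPoolId + "/" + b.blockId);
    }
}
```

With identity carried on each ECBlock, a DN handed a block group knows exactly which on-disk replicas to read for encoding and which to reconstruct when decoding.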
[jira] [Updated] (HDFS-7675) Unused member DFSClient.spanReceiverHost
[ https://issues.apache.org/jira/browse/HDFS-7675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-7675: --- Description: {{DFSClient#spanReceiverHost}} is initialised but never used. (was: DFSClient.spanReceiverHost is initialised but never used. Could be redundant. This was introduced by HDFS-7055.) Unused member DFSClient.spanReceiverHost Key: HDFS-7675 URL: https://issues.apache.org/jira/browse/HDFS-7675 Project: Hadoop HDFS Issue Type: Bug Components: dfsclient Affects Versions: 3.0.0 Reporter: Konstantin Shvachko Assignee: Colin Patrick McCabe Attachments: HDFS-7675.001.patch {{DFSClient#spanReceiverHost}} is initialised but never used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6729) Support maintenance mode for DN
[ https://issues.apache.org/jira/browse/HDFS-6729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294539#comment-14294539 ] Hadoop QA commented on HDFS-6729: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694856/HDFS-6729.005.patch against trunk revision 8bf6f0b. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestLeaseRecovery2 Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9350//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9350//console This message is automatically generated. Support maintenance mode for DN --- Key: HDFS-6729 URL: https://issues.apache.org/jira/browse/HDFS-6729 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.5.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-6729.000.patch, HDFS-6729.001.patch, HDFS-6729.002.patch, HDFS-6729.003.patch, HDFS-6729.004.patch, HDFS-6729.005.patch Some maintenance works (e.g., upgrading RAM or add disks) on DataNode only takes a short amount of time (e.g., 10 minutes). 
In these cases, users do not want missing blocks reported for this DN, because the DN will be back online shortly without data loss. Thus, we need a maintenance mode for DNs so that maintenance work can be carried out without having to decommission the DN or have it marked as dead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7675) Unused member DFSClient.spanReceiverHost
[ https://issues.apache.org/jira/browse/HDFS-7675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-7675: --- Status: Patch Available (was: Open) Unused member DFSClient.spanReceiverHost Key: HDFS-7675 URL: https://issues.apache.org/jira/browse/HDFS-7675 Project: Hadoop HDFS Issue Type: Bug Components: dfsclient Affects Versions: 3.0.0 Reporter: Konstantin Shvachko Assignee: Colin Patrick McCabe Attachments: HDFS-7675.001.patch DFSClient.spanReceiverHost is initialised but never used. Could be redundant. This was introduced by HDFS-7055. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7675) Unused member DFSClient.spanReceiverHost
[ https://issues.apache.org/jira/browse/HDFS-7675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-7675: --- Priority: Trivial (was: Major) Unused member DFSClient.spanReceiverHost Key: HDFS-7675 URL: https://issues.apache.org/jira/browse/HDFS-7675 Project: Hadoop HDFS Issue Type: Bug Components: dfsclient Affects Versions: 3.0.0 Reporter: Konstantin Shvachko Assignee: Colin Patrick McCabe Priority: Trivial Attachments: HDFS-7675.001.patch {{DFSClient#spanReceiverHost}} is initialised but never used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7677) DistributedFileSystem#truncate should resolve symlinks
[ https://issues.apache.org/jira/browse/HDFS-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-7677: - Resolution: Fixed Fix Version/s: 2.7.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk and branch-2. Thanks [~shv] for the review, and thanks [~cmccabe] for the comments. DistributedFileSystem#truncate should resolve symlinks -- Key: HDFS-7677 URL: https://issues.apache.org/jira/browse/HDFS-7677 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.7.0 Attachments: HDFS-7677.001.patch, HDFS-7677.002.patch We should resolve symlinks in DistributedFileSystem#truncate as we do for {{create}}, {{open}}, {{append}}, and so on; I don't see any reason not to support it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7677) DistributedFileSystem#truncate should resolve symlinks
[ https://issues.apache.org/jira/browse/HDFS-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294513#comment-14294513 ] Hudson commented on HDFS-7677: -- FAILURE: Integrated in Hadoop-trunk-Commit #6947 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6947/]) HDFS-7677. DistributedFileSystem#truncate should resolve symlinks. (yliu) (yliu: rev 9ca565e9704d236ce839c0138d82d54453d793fb) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileTruncate.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java DistributedFileSystem#truncate should resolve symlinks -- Key: HDFS-7677 URL: https://issues.apache.org/jira/browse/HDFS-7677 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.7.0 Attachments: HDFS-7677.001.patch, HDFS-7677.002.patch We should resolve symlinks in DistributedFileSystem#truncate as we do for {{create}}, {{open}}, {{append}}, and so on; I don't see any reason not to support it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
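The resolve-symlinks pattern used by {{create}}, {{open}}, and {{append}} boils down to: attempt the operation on the given path, and on an unresolved-link error retry against the link target. This is a hedged, self-contained sketch of that pattern; the exception class and map-backed namespace below are simplified stand-ins for Hadoop's FileSystemLinkResolver machinery, not the actual DistributedFileSystem code.

```java
import java.util.HashMap;
import java.util.Map;

public class SymlinkResolveSketch {
    static class UnresolvedLinkException extends Exception {
        final String target;
        UnresolvedLinkException(String target) { this.target = target; }
    }

    // Pretend namespace: "/link" is a symlink pointing at "/file".
    static final Map<String, String> symlinks = new HashMap<>();
    static { symlinks.put("/link", "/file"); }

    // The raw operation fails on a symlink, reporting the target.
    static String truncate(String path) throws UnresolvedLinkException {
        if (symlinks.containsKey(path)) {
            throw new UnresolvedLinkException(symlinks.get(path));
        }
        return "truncated " + path;
    }

    // The resolving wrapper follows the link and retries.
    static String truncateResolving(String path) {
        try {
            return truncate(path);
        } catch (UnresolvedLinkException e) {
            return truncateResolving(e.target);
        }
    }

    public static void main(String[] args) {
        System.out.println(truncateResolving("/link"));
    }
}
```

The HDFS-7677 fix makes truncate go through this same resolve-and-retry path instead of failing on symlinked files.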
[jira] [Updated] (HDFS-7675) Unused member DFSClient.spanReceiverHost
[ https://issues.apache.org/jira/browse/HDFS-7675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-7675: --- Attachment: HDFS-7675.001.patch This patch will fix the warning. We can file a follow-up JIRA to reference-count {{SpanReceiverHost}} objects. Unused member DFSClient.spanReceiverHost Key: HDFS-7675 URL: https://issues.apache.org/jira/browse/HDFS-7675 Project: Hadoop HDFS Issue Type: Bug Components: dfsclient Affects Versions: 3.0.0 Reporter: Konstantin Shvachko Attachments: HDFS-7675.001.patch DFSClient.spanReceiverHost is initialised but never used. Could be redundant. This was introduced by HDFS-7055. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7175) Client-side SocketTimeoutException during Fsck
[ https://issues.apache.org/jira/browse/HDFS-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294591#comment-14294591 ] Hadoop QA commented on HDFS-7175: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12673576/HDFS-7175.3.patch against trunk revision 0a05ae1. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.TestBlockScanner The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDatanodeDeath Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9351//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9351//console This message is automatically generated. 
Client-side SocketTimeoutException during Fsck -- Key: HDFS-7175 URL: https://issues.apache.org/jira/browse/HDFS-7175 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Carl Steinbach Assignee: Akira AJISAKA Attachments: HDFS-7175.2.patch, HDFS-7175.3.patch, HDFS-7175.patch, HDFS-7175.patch HDFS-2538 disabled status reporting for the fsck command (it can optionally be enabled with the -showprogress option). We have observed that without status reporting the client will abort with read timeout: {noformat} [hdfs@lva1-hcl0030 ~]$ hdfs fsck / Connecting to namenode via http://lva1-tarocknn01.grid.linkedin.com:50070 14/09/30 06:03:41 WARN security.UserGroupInformation: PriviledgedActionException as:h...@grid.linkedin.com (auth:KERBEROS) cause:java.net.SocketTimeoutException: Read timed out Exception in thread main java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:152) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323) at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:312) at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:72) at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:149) at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:146) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:145) at 
org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:346) {noformat} Since there's nothing for the client to read it will abort if the time required to complete the fsck operation is longer than the client's read timeout setting. I can think of a couple ways to fix this: # Set an infinite read timeout on the client side (not a good idea!). # Have the server-side write (and flush) zeros to the wire and instruct the client to ignore these characters instead of echoing them. # It's possible that flushing an empty buffer on the server-side will trigger an HTTP response with a zero length payload. This may be enough to keep the client from hanging up.
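Option 2 above can be sketched as a tiny keep-alive writer on the server side. This is a minimal illustration only, not the actual NamenodeFsck code; the method and parameter names are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;

public class KeepAliveSketch {
    // Every flushEvery items, emit one filler byte ('\0') that the client
    // is instructed to discard, and flush it so something actually hits
    // the wire and resets the client's read timeout.
    static void writeProgress(OutputStream out, long itemsChecked,
                              long flushEvery) throws IOException {
        if (itemsChecked % flushEvery == 0) {
            out.write(0);   // filler byte the client skips instead of echoing
            out.flush();    // push it onto the socket now
        }
    }
}
```

The key point is that at least one byte must be written before the flush; flushing alone does not generate traffic (this is what later testing on the JIRA confirmed).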
[jira] [Updated] (HDFS-6651) Deletion failure can leak inodes permanently.
[ https://issues.apache.org/jira/browse/HDFS-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-6651: Attachment: HDFS-6651.001.patch Rebase the patch. Deletion failure can leak inodes permanently. - Key: HDFS-6651 URL: https://issues.apache.org/jira/browse/HDFS-6651 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Jing Zhao Priority: Critical Attachments: HDFS-6651.000.patch, HDFS-6651.001.patch As discussed in HDFS-6618, if a deletion of tree fails in the middle, any collected inodes and blocks will not be removed from {{INodeMap}} and {{BlocksMap}}. Since fsimage is saved by iterating over {{INodeMap}}, the leak will persist across name node restart. Although blanked out inodes will not have reference to blocks, blocks will still refer to the inode as {{BlockCollection}}. As long as it is not null, blocks will live on. The leaked blocks from blanked out inodes will go away after restart. Options (when delete fails in the middle) - Complete the partial delete: edit log the partial delete and remove inodes and blocks. - Somehow undo the partial delete. - Check quota for snapshot diff beforehand for the whole subtree. - Ignore quota check during delete even if snapshot is present. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7018) Implement C interface for libhdfs3
[ https://issues.apache.org/jira/browse/HDFS-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294583#comment-14294583 ] Zhanwei Wang commented on HDFS-7018: Hi [~cmccabe] About using exceptions in C\+\+, Google also lists many benefits and concludes that {{the benefits of using exceptions outweigh the costs, especially in new projects}}. Anyway, it's a controversial topic and not related to this JIRA. We can discuss it in another JIRA and keep this one focused. bq. Do we need to change the type of blocksize here? That is going to potentially impact libhdfs and libwebhdfs {{blocksize}} is a 64-bit integer in the Java code and Java API. I cannot find any reason to restrict it to 32 bits in the C API. About the impact: 1) This API change is compatible, so existing applications do not need to change. 2) The libhdfs code does not need to change; it uses {{jlong}}, which is 64-bit. 3) The libwebhdfs code is more complicated. It uses {{tSize}} and {{size_t}} internally in different places to hold the block size, so libwebhdfs needs more code changes if we change blocksize to 64-bit. I'm not sure it is good to make such an API change in this JIRA. What's your opinion? Implement C interface for libhdfs3 -- Key: HDFS-7018 URL: https://issues.apache.org/jira/browse/HDFS-7018 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Zhanwei Wang Assignee: Zhanwei Wang Attachments: HDFS-7018-pnative.002.patch, HDFS-7018-pnative.003.patch, HDFS-7018-pnative.004.patch, HDFS-7018.patch Implement C interface for libhdfs3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7675) Unused member DFSClient.spanReceiverHost
[ https://issues.apache.org/jira/browse/HDFS-7675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-7675: - Assignee: Colin Patrick McCabe Unused member DFSClient.spanReceiverHost Key: HDFS-7675 URL: https://issues.apache.org/jira/browse/HDFS-7675 Project: Hadoop HDFS Issue Type: Bug Components: dfsclient Affects Versions: 3.0.0 Reporter: Konstantin Shvachko Assignee: Colin Patrick McCabe Attachments: HDFS-7675.001.patch DFSClient.spanReceiverHost is initialised but never used. Could be redundant. This was introduced by HDFS-7055. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7675) Unused member DFSClient.spanReceiverHost
[ https://issues.apache.org/jira/browse/HDFS-7675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294534#comment-14294534 ] Yi Liu commented on HDFS-7675: -- +1, Thanks Colin Unused member DFSClient.spanReceiverHost Key: HDFS-7675 URL: https://issues.apache.org/jira/browse/HDFS-7675 Project: Hadoop HDFS Issue Type: Bug Components: dfsclient Affects Versions: 3.0.0 Reporter: Konstantin Shvachko Attachments: HDFS-7675.001.patch DFSClient.spanReceiverHost is initialised but never used. Could be redundant. This was introduced by HDFS-7055. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7655) Expose truncate API for Web HDFS
[ https://issues.apache.org/jira/browse/HDFS-7655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu updated HDFS-7655: - Attachment: HDFS-7655.002.patch Thanks Uma for review. Update the patch to address the comments. {quote} We need to update documentation for this in WebHdfs section after Append API ? {quote} Will do it in a follow up JIRA. Expose truncate API for Web HDFS Key: HDFS-7655 URL: https://issues.apache.org/jira/browse/HDFS-7655 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Yi Liu Assignee: Yi Liu Attachments: HDFS-7655.001.patch, HDFS-7655.002.patch This JIRA is to expose truncate API for Web HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7224) Allow reuse of NN connections via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293598#comment-14293598 ] Hudson commented on HDFS-7224: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #2018 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2018/]) HDFS-7224. Allow reuse of NN connections via webhdfs. Contributed by Eric Payne (kihwal: rev 2b0fa20f69417326a92beac10ffa072db2616e73) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestFSMainOperationsWebHdfs.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java Allow reuse of NN connections via webhdfs - Key: HDFS-7224 URL: https://issues.apache.org/jira/browse/HDFS-7224 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.5.0 Reporter: Eric Payne Assignee: Eric Payne Fix For: 2.7.0 Attachments: HDFS-7224.v1.201410301923.txt, HDFS-7224.v2.201410312033.txt, HDFS-7224.v3.txt, HDFS-7224.v4.txt In very large clusters, the webhdfs client could get bind exceptions because it runs out of ephemeral ports. This could happen when using webhdfs to talk to the NN in order to do list globbing of a huge amount of files. WebHdfsFileSystem#jsonParse gets the input/error stream from the connection, but never closes the stream. Since it's not closed, the JVM thinks the stream may still be transferring data, so the next time through this code, it has to get a new connection rather than reusing an existing one. The lack of connection reuse has poor latency and adds too much overhead to the NN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
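The fix described above hinges on draining and closing the response stream so the JVM's keep-alive cache can reuse the underlying socket. A minimal sketch of that idea follows; the helper name is hypothetical and this is not the actual WebHdfsFileSystem code.

```java
import java.io.IOException;
import java.io.InputStream;

public class ReuseSketch {
    // Read the stream to EOF and close it. Only once the JVM sees EOF
    // on an HttpURLConnection stream does it consider the connection
    // idle and eligible for keep-alive reuse; an unclosed stream forces
    // a brand-new connection (and a new ephemeral port) next time.
    static void drainAndClose(InputStream in) throws IOException {
        try {
            byte[] buf = new byte[4096];
            while (in.read(buf) != -1) {
                // discard remaining bytes
            }
        } finally {
            in.close();
        }
    }
}
```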
[jira] [Commented] (HDFS-49) MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found
[ https://issues.apache.org/jira/browse/HDFS-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293604#comment-14293604 ] Hudson commented on HDFS-49: SUCCESS: Integrated in Hadoop-Hdfs-trunk #2018 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2018/]) HDFS-49. MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found. (stevel) (stevel: rev 0da53a37ec46b887f441df98c6986b31fa7671a2) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found --- Key: HDFS-49 URL: https://issues.apache.org/jira/browse/HDFS-49 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.20.204.0, 0.20.205.0, 1.1.0 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Labels: codereview, newbie Fix For: 2.7.0 Attachments: HDFS-49-002.patch, hdfs-49.patch Original Estimate: 0.5h Remaining Estimate: 0.5h The stopDataNode method will shut down the last node in the list of nodes, if one matching a specific name is not found. This is possibly not what was intended. Better to return false or fail in some other manner if the named node was not located: synchronized boolean stopDataNode(String name) { int i; for (i = 0; i < dataNodes.size(); i++) { DataNode dn = dataNodes.get(i).datanode; if (dn.dnRegistration.getName().equals(name)) { break; } } return stopDataNode(i); } -- This message was sent by Atlassian JIRA (v6.3.4#6332)
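The suggested fix — return false when no registration name matches instead of stopping an arbitrary node — can be sketched as follows. This is a simplified stand-in, not the actual MiniDFSCluster code: the node list is modeled as a list of registration names.

```java
import java.util.ArrayList;
import java.util.List;

public class StopSketch {
    // Stand-in for dataNodes; each entry models dn.dnRegistration.getName().
    static List<String> names = new ArrayList<>();

    // Stop the node whose registration name matches; return false if
    // none matches, rather than falling through to an out-of-range index.
    static synchronized boolean stopDataNode(String name) {
        int i;
        for (i = 0; i < names.size(); i++) {
            if (names.get(i).equals(name)) {
                break;
            }
        }
        if (i == names.size()) {
            return false;       // name not found: stop nothing
        }
        names.remove(i);        // stand-in for stopDataNode(i)
        return true;
    }
}
```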
[jira] [Commented] (HDFS-4681) TestBlocksWithNotEnoughRacks#testCorruptBlockRereplicatedAcrossRacks fails using IBM java
[ https://issues.apache.org/jira/browse/HDFS-4681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293627#comment-14293627 ] Hadoop QA commented on HDFS-4681: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694769/HDFS-4681-v2.patch against trunk revision 0da53a3. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9345//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9345//console This message is automatically generated. TestBlocksWithNotEnoughRacks#testCorruptBlockRereplicatedAcrossRacks fails using IBM java - Key: HDFS-4681 URL: https://issues.apache.org/jira/browse/HDFS-4681 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.5.2 Environment: PowerPC Big Endian architecture Reporter: Tian Hong Wang Assignee: Suresh Srinivas Attachments: HDFS-4681-v1.patch, HDFS-4681-v2.patch, HDFS-4681.patch TestBlocksWithNotEnoughRacks unit test fails with the following error message: testCorruptBlockRereplicatedAcrossRacks(org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks) Time elapsed: 8997 sec FAILURE! 
org.junit.ComparisonFailure: Corrupt replica expected:<...[unprintable binary block contents]> but was:<...[unprintable binary block contents, differing in a few bytes]> at org.junit.Assert.assertEquals(Assert.java:123) at org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks.testCorruptBlockRereplicatedAcrossRacks(TestBlocksWithNotEnoughRacks.java:229) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-49) MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found
[ https://issues.apache.org/jira/browse/HDFS-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293649#comment-14293649 ] Hudson commented on HDFS-49: FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #87 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/87/]) HDFS-49. MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found. (stevel) (stevel: rev 0da53a37ec46b887f441df98c6986b31fa7671a2) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found --- Key: HDFS-49 URL: https://issues.apache.org/jira/browse/HDFS-49 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.20.204.0, 0.20.205.0, 1.1.0 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Labels: codereview, newbie Fix For: 2.7.0 Attachments: HDFS-49-002.patch, hdfs-49.patch Original Estimate: 0.5h Remaining Estimate: 0.5h The stopDataNode method will shut down the last node in the list of nodes, if one matching a specific name is not found. This is possibly not what was intended. Better to return false or fail in some other manner if the named node was not located: synchronized boolean stopDataNode(String name) { int i; for (i = 0; i < dataNodes.size(); i++) { DataNode dn = dataNodes.get(i).datanode; if (dn.dnRegistration.getName().equals(name)) { break; } } return stopDataNode(i); } -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7224) Allow reuse of NN connections via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293642#comment-14293642 ] Hudson commented on HDFS-7224: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #87 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/87/]) HDFS-7224. Allow reuse of NN connections via webhdfs. Contributed by Eric Payne (kihwal: rev 2b0fa20f69417326a92beac10ffa072db2616e73) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestFSMainOperationsWebHdfs.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java Allow reuse of NN connections via webhdfs - Key: HDFS-7224 URL: https://issues.apache.org/jira/browse/HDFS-7224 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.5.0 Reporter: Eric Payne Assignee: Eric Payne Fix For: 2.7.0 Attachments: HDFS-7224.v1.201410301923.txt, HDFS-7224.v2.201410312033.txt, HDFS-7224.v3.txt, HDFS-7224.v4.txt In very large clusters, the webhdfs client could get bind exceptions because it runs out of ephemeral ports. This could happen when using webhdfs to talk to the NN in order to do list globbing of a huge amount of files. WebHdfsFileSystem#jsonParse gets the input/error stream from the connection, but never closes the stream. Since it's not closed, the JVM thinks the stream may still be transferring data, so the next time through this code, it has to get a new connection rather than reusing an existing one. The lack of connection reuse has poor latency and adds too much overhead to the NN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-49) MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found
[ https://issues.apache.org/jira/browse/HDFS-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293572#comment-14293572 ] Hudson commented on HDFS-49: FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #83 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/83/]) HDFS-49. MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found. (stevel) (stevel: rev 0da53a37ec46b887f441df98c6986b31fa7671a2) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found --- Key: HDFS-49 URL: https://issues.apache.org/jira/browse/HDFS-49 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.20.204.0, 0.20.205.0, 1.1.0 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Labels: codereview, newbie Fix For: 2.7.0 Attachments: HDFS-49-002.patch, hdfs-49.patch Original Estimate: 0.5h Remaining Estimate: 0.5h The stopDataNode method will shut down the last node in the list of nodes, if one matching a specific name is not found. This is possibly not what was intended. Better to return false or fail in some other manner if the named node was not located: synchronized boolean stopDataNode(String name) { int i; for (i = 0; i < dataNodes.size(); i++) { DataNode dn = dataNodes.get(i).datanode; if (dn.dnRegistration.getName().equals(name)) { break; } } return stopDataNode(i); } -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-49) MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found
[ https://issues.apache.org/jira/browse/HDFS-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293581#comment-14293581 ] Hudson commented on HDFS-49: FAILURE: Integrated in Hadoop-Mapreduce-trunk #2037 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2037/]) HDFS-49. MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found. (stevel) (stevel: rev 0da53a37ec46b887f441df98c6986b31fa7671a2) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt MiniDFSCluster.stopDataNode will always shut down a node in the cluster if a matching name is not found --- Key: HDFS-49 URL: https://issues.apache.org/jira/browse/HDFS-49 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.20.204.0, 0.20.205.0, 1.1.0 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Labels: codereview, newbie Fix For: 2.7.0 Attachments: HDFS-49-002.patch, hdfs-49.patch Original Estimate: 0.5h Remaining Estimate: 0.5h The stopDataNode method will shut down the last node in the list of nodes, if one matching a specific name is not found. This is possibly not what was intended. Better to return false or fail in some other manner if the named node was not located: synchronized boolean stopDataNode(String name) { int i; for (i = 0; i < dataNodes.size(); i++) { DataNode dn = dataNodes.get(i).datanode; if (dn.dnRegistration.getName().equals(name)) { break; } } return stopDataNode(i); } -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7224) Allow reuse of NN connections via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293566#comment-14293566 ] Hudson commented on HDFS-7224: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #83 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/83/]) HDFS-7224. Allow reuse of NN connections via webhdfs. Contributed by Eric Payne (kihwal: rev 2b0fa20f69417326a92beac10ffa072db2616e73) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestFSMainOperationsWebHdfs.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Allow reuse of NN connections via webhdfs - Key: HDFS-7224 URL: https://issues.apache.org/jira/browse/HDFS-7224 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.5.0 Reporter: Eric Payne Assignee: Eric Payne Fix For: 2.7.0 Attachments: HDFS-7224.v1.201410301923.txt, HDFS-7224.v2.201410312033.txt, HDFS-7224.v3.txt, HDFS-7224.v4.txt In very large clusters, the webhdfs client could get bind exceptions because it runs out of ephemeral ports. This could happen when using webhdfs to talk to the NN in order to do list globbing of a huge amount of files. WebHdfsFileSystem#jsonParse gets the input/error stream from the connection, but never closes the stream. Since it's not closed, the JVM thinks the stream may still be transferring data, so the next time through this code, it has to get a new connection rather than reusing an existing one. The lack of connection reuse has poor latency and adds too much overhead to the NN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7224) Allow reuse of NN connections via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293575#comment-14293575 ] Hudson commented on HDFS-7224: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2037 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2037/]) HDFS-7224. Allow reuse of NN connections via webhdfs. Contributed by Eric Payne (kihwal: rev 2b0fa20f69417326a92beac10ffa072db2616e73) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestFSMainOperationsWebHdfs.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java Allow reuse of NN connections via webhdfs - Key: HDFS-7224 URL: https://issues.apache.org/jira/browse/HDFS-7224 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.5.0 Reporter: Eric Payne Assignee: Eric Payne Fix For: 2.7.0 Attachments: HDFS-7224.v1.201410301923.txt, HDFS-7224.v2.201410312033.txt, HDFS-7224.v3.txt, HDFS-7224.v4.txt In very large clusters, the webhdfs client could get bind exceptions because it runs out of ephemeral ports. This could happen when using webhdfs to talk to the NN in order to do list globbing of a huge amount of files. WebHdfsFileSystem#jsonParse gets the input/error stream from the connection, but never closes the stream. Since it's not closed, the JVM thinks the stream may still be transferring data, so the next time through this code, it has to get a new connection rather than reusing an existing one. The lack of connection reuse has poor latency and adds too much overhead to the NN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-6571) NameNode should delete intermediate fsimage.ckpt when checkpoint fails
[ https://issues.apache.org/jira/browse/HDFS-6571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb reassigned HDFS-6571: -- Assignee: Charles Lamb NameNode should delete intermediate fsimage.ckpt when checkpoint fails -- Key: HDFS-6571 URL: https://issues.apache.org/jira/browse/HDFS-6571 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Akira AJISAKA Assignee: Charles Lamb When checkpoint fails in getting a new fsimage from standby NameNode or SecondaryNameNode, intermediate fsimage (fsimage.ckpt_txid) is left and never to be cleaned up. If fsimage is large and fails to checkpoint many times, the growing intermediate fsimage may cause out of disk space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7175) Client-side SocketTimeoutException during Fsck
[ https://issues.apache.org/jira/browse/HDFS-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294338#comment-14294338 ] Subbu commented on HDFS-7175: - I apologize for the delay in verification of this bug. I have now verified that no matter what the value is for the frequency of flush, the solution does NOT work. Basically, the flush() call has no effect since there are no bytes to flush. Here is what I did to verify this: * Brought up a single-node cluster. * Changed the frequency of flush to 1 (instead of 10k or 100k). * Ran fsck on a small directory with 10 files, both with and without the -showprogress option. * Ran tcpdump on the namenode port to capture packets during the session. I could see that the dots were sent out on the channel when -showprogress was specified, but the channel was quiet when it was not. So, we need to think of another way to solve the problem. Client-side SocketTimeoutException during Fsck -- Key: HDFS-7175 URL: https://issues.apache.org/jira/browse/HDFS-7175 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Carl Steinbach Assignee: Akira AJISAKA Attachments: HDFS-7175.2.patch, HDFS-7175.3.patch, HDFS-7175.patch, HDFS-7175.patch HDFS-2538 disabled status reporting for the fsck command (it can optionally be enabled with the -showprogress option). 
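The observation above — that flush() alone puts nothing on the wire — can be demonstrated in isolation. This sketch is illustrative, not the patch's code; it models the server's response stream with a ByteArrayOutputStream.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

public class FlushSketch {
    // Flushing a buffer that holds no bytes writes nothing downstream,
    // which is why a periodic flush() without any preceding write()
    // cannot keep the fsck client's read timeout from firing.
    static int bytesAfterEmptyFlush() throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        BufferedOutputStream buffered = new BufferedOutputStream(wire);
        buffered.flush();       // nothing buffered, so nothing is written
        return wire.size();
    }
}
```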
We have observed that without status reporting the client will abort with read timeout: {noformat} [hdfs@lva1-hcl0030 ~]$ hdfs fsck / Connecting to namenode via http://lva1-tarocknn01.grid.linkedin.com:50070 14/09/30 06:03:41 WARN security.UserGroupInformation: PriviledgedActionException as:h...@grid.linkedin.com (auth:KERBEROS) cause:java.net.SocketTimeoutException: Read timed out Exception in thread main java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:152) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323) at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:312) at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:72) at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:149) at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:146) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:145) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:346) {noformat} Since there's nothing for the client to read it will abort if the time required to complete the fsck operation is longer than the client's read timeout setting. 
I can think of a couple ways to fix this: # Set an infinite read timeout on the client side (not a good idea!). # Have the server-side write (and flush) zeros to the wire and instruct the client to ignore these characters instead of echoing them. # It's possible that flushing an empty buffer on the server-side will trigger an HTTP response with a zero length payload. This may be enough to keep the client from hanging up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7566) Remove obsolete entries from hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-7566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294337#comment-14294337 ] Hudson commented on HDFS-7566: -- SUCCESS: Integrated in Hadoop-trunk-Commit #6944 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6944/]) HDFS-7566. Remove obsolete entries from hdfs-default.xml (Ray Chiang via aw) (aw: rev 0a05ae1782488597cbf8667866f98f0df341abc0) * hadoop-tools/hadoop-sls/src/main/data/2jobs2min-rumen-jh.json * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/resources/job_1329348432655_0001_conf.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Remove obsolete entries from hdfs-default.xml - Key: HDFS-7566 URL: https://issues.apache.org/jira/browse/HDFS-7566 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Ray Chiang Assignee: Ray Chiang Labels: supportability Fix For: 2.7.0 Attachments: HDFS-7566.001.patch So far, I've found these five properties which may be obsolete in hdfs-default.xml: - dfs.https.enable - dfs.namenode.edits.journal-plugin.qjournal - dfs.namenode.logging.level - dfs.ha.namenodes.EXAMPLENAMESERVICE + Should this be kept in the .xml file? - dfs.support.append + Removed with HDFS-6246 I'd like to get feedback about the state of any of the above properties. This is the HDFS equivalent of MAPREDUCE-6057 and YARN-2460. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7611) deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart.
[ https://issues.apache.org/jira/browse/HDFS-7611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294336#comment-14294336 ] Jing Zhao commented on HDFS-7611: - I cannot reproduce the failure of TestOpenFilesWithSnapshot in my local environment, and the new Jenkins run did not complain about it. Looks like this is intermittent. From the log of the previous run, it looks like the DNs could not finish registration and block reports during {{waitClusterUp}}. Maybe this is related to 1) a slow environment, and 2) the client still trying to write data to DNs in the test, which triggered IBRs that delayed the process further. We can open a separate jira to track this if you think it's necessary. deleteSnapshot and delete of a file can leave orphaned blocks in the blocksMap on NameNode restart. --- Key: HDFS-7611 URL: https://issues.apache.org/jira/browse/HDFS-7611 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.6.0 Reporter: Konstantin Shvachko Assignee: Byron Wong Priority: Critical Attachments: HDFS-7611.000.patch, HDFS-7611.001.patch, HDFS-7611.002.patch, blocksNotDeletedTest.patch, testTruncateEditLogLoad.log If quotas are enabled a combination of operations *deleteSnapshot* and *delete* of a file can leave orphaned blocks in the blocksMap on NameNode restart. They are counted as missing on the NameNode, and can prevent NameNode from coming out of safeMode and could cause memory leak during startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-6673) Add Delimited format supports for PB OIV tool
[ https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294375#comment-14294375 ] Lei (Eddy) Xu commented on HDFS-6673: - [~andrew.wang], [~wheat9] and I had an offline call about this issue. We identified two different use cases: # The user downloads the fsimage to his laptop and runs the PB OIV tool there # The user runs the PB OIV tool as a MapReduce task. [~wheat9] raised the concern that when the working set (the directory-to-file-names mapping and the inode-to-parent-inode mapping) is larger than memory, it is hard to predict the execution time of OIV running as a MapReduce task, because such tasks usually run on DNs with relatively little memory and HDDs, and random seeks in LevelDB might kill the performance. He suggested that rather than letting the MR task run unexpectedly long, it would be better to let it fail fast. We think it would be better to use the {{InMemoryMap}} here to store metadata in memory for the MR task, so that if the working set is too large, the MR task will run out of memory and die fast. We can then suggest the user run the task on a machine with more memory. On the other hand, for case #1, the user can leverage the laptop's SSD to get decent performance even for a large fsimage, without requiring large memory. In summary, we suggest using the PB OIV tool as follows: * For a very small fsimage (e.g., 1GB), or a very large fsimage on a machine with an HDD and limited RAM (e.g., a 40+GB fsimage vs 8GB of RAM), use {{InMemoryMap}} by not specifying the {{--tempdb}} parameter; users are advised to run it with very large RAM. * Otherwise, the user can pass {{--tempdb}} to specify a path where LevelDB stores the metadata off-heap. [~wheat9] and [~andrew.wang], does the above cover all the information we discussed? [~wheat9], can I get a +0 from you? 
Add Delimited format supports for PB OIV tool - Key: HDFS-6673 URL: https://issues.apache.org/jira/browse/HDFS-6673 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 2.4.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Priority: Minor Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, HDFS-6673.005.patch, HDFS-6673.006.patch The new oiv tool, which is designed for the Protobuf fsimage, lacks a few features supported in the old {{oiv}} tool. This task adds support for the _Delimited_ processor in the oiv tool.
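As a hypothetical usage sketch of the recommendation above (the {{--tempdb}} flag name follows this discussion and, like the fsimage file name, is illustrative; the committed patch may differ):

```
# Small fsimage, or a machine with plenty of RAM: use the in-memory map
# by omitting --tempdb (metadata is held on the heap).
hdfs oiv -p Delimited -i fsimage_0000000000000000042 -o fsimage.txt

# Otherwise, spill the metadata to LevelDB on local disk (ideally an SSD)
# instead of the heap:
hdfs oiv -p Delimited -i fsimage_0000000000000000042 -o fsimage.txt --tempdb /tmp/oiv-leveldb
```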
[jira] [Commented] (HDFS-5420) Not need to launch Secondary namenode for NN HA mode?
[ https://issues.apache.org/jira/browse/HDFS-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294374#comment-14294374 ] Hadoop QA commented on HDFS-5420: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12694825/HDFS-5420-00.patch against trunk revision 1e2d98a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/9349//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9349//console This message is automatically generated. Not need to launch Secondary namenode for NN HA mode? - Key: HDFS-5420 URL: https://issues.apache.org/jira/browse/HDFS-5420 Project: Hadoop HDFS Issue Type: Bug Components: scripts Affects Versions: 3.0.0 Reporter: Raymond Liu Priority: Minor Attachments: HDFS-5420-00.patch For Hadoop 2, When deploying with NN HA, the wiki says that it is an error to start a secondary namenode. 
However, sbin/start-dfs.sh still launches a secondary namenode even when nothing related to the secondary namenode is configured. Should this be fixed? Or do people simply not use this script to start an HA HDFS?
[jira] [Updated] (HDFS-7559) Create unit test to automatically compare HDFS related classes and hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-7559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated HDFS-7559: - Status: Patch Available (was: Open) HDFS-7566 committed. Submitting patch with the .xml-to-Java configuration property testing turned on. Create unit test to automatically compare HDFS related classes and hdfs-default.xml --- Key: HDFS-7559 URL: https://issues.apache.org/jira/browse/HDFS-7559 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ray Chiang Assignee: Ray Chiang Priority: Minor Labels: supportability Attachments: HDFS-7559.001.patch, HDFS-7559.002.patch Create a unit test that will automatically compare the fields in the various HDFS related classes and hdfs-default.xml. It should throw an error if a property is missing in either the class or the file.
[jira] [Commented] (HDFS-7175) Client-side SocketTimeoutException during Fsck
[ https://issues.apache.org/jira/browse/HDFS-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294442#comment-14294442 ] Subbu commented on HDFS-7175: - One way to fix this may be to emit the progress dots on the server even if -showprogress is not specified, and then filter them out in the client (if the option is not specified). Seems like a hacky solution, though. Client-side SocketTimeoutException during Fsck -- Key: HDFS-7175 URL: https://issues.apache.org/jira/browse/HDFS-7175 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Carl Steinbach Assignee: Akira AJISAKA Attachments: HDFS-7175.2.patch, HDFS-7175.3.patch, HDFS-7175.patch, HDFS-7175.patch HDFS-2538 disabled status reporting for the fsck command (it can optionally be enabled with the -showprogress option). We have observed that without status reporting the client will abort with a read timeout: {noformat} [hdfs@lva1-hcl0030 ~]$ hdfs fsck / Connecting to namenode via http://lva1-tarocknn01.grid.linkedin.com:50070 14/09/30 06:03:41 WARN security.UserGroupInformation: PriviledgedActionException as:h...@grid.linkedin.com (auth:KERBEROS) cause:java.net.SocketTimeoutException: Read timed out Exception in thread main java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:152) at java.net.SocketInputStream.read(SocketInputStream.java:122) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:633) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1323) at org.apache.hadoop.hdfs.tools.DFSck.doWork(DFSck.java:312) at org.apache.hadoop.hdfs.tools.DFSck.access$000(DFSck.java:72) at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:149) at org.apache.hadoop.hdfs.tools.DFSck$1.run(DFSck.java:146) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.hdfs.tools.DFSck.run(DFSck.java:145) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:346) {noformat} Since there's nothing for the client to read, it will abort if the time required to complete the fsck operation is longer than the client's read timeout setting. I can think of a couple of ways to fix this: # Set an infinite read timeout on the client side (not a good idea!). # Have the server side write (and flush) zeros to the wire and instruct the client to ignore these characters instead of echoing them. # It's possible that flushing an empty buffer on the server side will trigger an HTTP response with a zero-length payload. This may be enough to keep the client from hanging up.
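Option 2 above can be sketched in isolation. This is an illustrative sketch, not the actual DFSck/NamenodeFsck code: a worker thread performs the long operation while the caller periodically writes and flushes a filler byte, so the client always has something to read and its read timeout never fires.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class KeepAliveWriter {
    // Runs longOperation on a worker thread; meanwhile writes and flushes a
    // filler byte every intervalMillis so the peer always has data to read.
    // The client is expected to ignore the filler bytes (here '\0').
    public static void writeWithKeepAlive(OutputStream out, Runnable longOperation,
                                          long intervalMillis)
            throws IOException, InterruptedException {
        Thread worker = new Thread(longOperation);
        worker.start();
        while (worker.isAlive()) {
            out.write('\0');
            out.flush();
            worker.join(intervalMillis); // wait, but wake up to write again
        }
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        // Simulate a long-running fsck with a short sleep.
        writeWithKeepAlive(buf, () -> {
            try { Thread.sleep(50); } catch (InterruptedException e) { }
        }, 10);
        System.out.println("filler bytes written: " + buf.size());
    }
}
```

Option 3 (flushing an empty buffer) would drop the filler write and keep only the flush; whether that alone generates traffic on the wire depends on the HTTP server's chunked-encoding behavior.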
[jira] [Updated] (HDFS-7686) Corrupt block reporting to namenode soon feature is overwritten by HDFS-7430
[ https://issues.apache.org/jira/browse/HDFS-7686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rushabh S Shah updated HDFS-7686: - Priority: Blocker (was: Major) Corrupt block reporting to namenode soon feature is overwritten by HDFS-7430 --- Key: HDFS-7686 URL: https://issues.apache.org/jira/browse/HDFS-7686 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Rushabh S Shah Priority: Blocker The feature implemented in HDFS-7548 is removed by HDFS-7430.
[jira] [Updated] (HDFS-7566) Remove obsolete entries from hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-7566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-7566: --- Resolution: Fixed Fix Version/s: 2.7.0 Status: Resolved (was: Patch Available) +1 committed to trunk and branch-2 Thanks! Remove obsolete entries from hdfs-default.xml - Key: HDFS-7566 URL: https://issues.apache.org/jira/browse/HDFS-7566 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Ray Chiang Assignee: Ray Chiang Labels: supportability Fix For: 2.7.0 Attachments: HDFS-7566.001.patch So far, I've found these five properties which may be obsolete in hdfs-default.xml: - dfs.https.enable - dfs.namenode.edits.journal-plugin.qjournal - dfs.namenode.logging.level - dfs.ha.namenodes.EXAMPLENAMESERVICE + Should this be kept in the .xml file? - dfs.support.append + Removed with HDFS-6246 I'd like to get feedback about the state of any of the above properties. This is the HDFS equivalent of MAPREDUCE-6057 and YARN-2460.
[jira] [Commented] (HDFS-7411) Refactor and improve decommissioning logic into DecommissionManager
[ https://issues.apache.org/jira/browse/HDFS-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294345#comment-14294345 ] Andrew Wang commented on HDFS-7411: --- Hi Nicholas, Those methods you mentioned are pretty small, and are far from the bulk of the patch. I would really prefer not to split the patch at this point, since it's a lot of work. Two other reviewers have also made it through this patch successfully, so I don't think it's so bad to review. Regarding the removed config property, this is something discussed above. I don't see a way of deprecating this gracefully, since the units of the old and new config properties are incompatible. We needed to change the units because limiting by # of nodes is quite erratic, and given that, I don't think there's much harm in just removing the old config. It doesn't do much right now, and with the patch it'll do nothing. I can move the test method over as you suggested, but would prefer to get clarity on the above before posting another patch rev. Thanks. Refactor and improve decommissioning logic into DecommissionManager --- Key: HDFS-7411 URL: https://issues.apache.org/jira/browse/HDFS-7411 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.5.1 Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-7411.001.patch, hdfs-7411.002.patch, hdfs-7411.003.patch, hdfs-7411.004.patch, hdfs-7411.005.patch, hdfs-7411.006.patch, hdfs-7411.007.patch, hdfs-7411.008.patch, hdfs-7411.009.patch Would be nice to split out decommission logic from DatanodeManager to DecommissionManager.
[jira] [Commented] (HDFS-7566) Remove obsolete entries from hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-7566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14294430#comment-14294430 ] Ray Chiang commented on HDFS-7566: -- Thanks for the review and commit! Remove obsolete entries from hdfs-default.xml - Key: HDFS-7566 URL: https://issues.apache.org/jira/browse/HDFS-7566 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.6.0 Reporter: Ray Chiang Assignee: Ray Chiang Labels: supportability Fix For: 2.7.0 Attachments: HDFS-7566.001.patch So far, I've found these five properties which may be obsolete in hdfs-default.xml: - dfs.https.enable - dfs.namenode.edits.journal-plugin.qjournal - dfs.namenode.logging.level - dfs.ha.namenodes.EXAMPLENAMESERVICE + Should this be kept in the .xml file? - dfs.support.append + Removed with HDFS-6246 I'd like to get feedback about the state of any of the above properties. This is the HDFS equivalent of MAPREDUCE-6057 and YARN-2460.
[jira] [Created] (HDFS-7687) Change fsck to support EC files
Tsz Wo Nicholas Sze created HDFS-7687: - Summary: Change fsck to support EC files Key: HDFS-7687 URL: https://issues.apache.org/jira/browse/HDFS-7687 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze We need to change fsck so that it can detect under-replicated and corrupted EC files.
[jira] [Commented] (HDFS-7376) Upgrade jsch lib to jsch-0.1.51 to avoid problems running on java7
[ https://issues.apache.org/jira/browse/HDFS-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293301#comment-14293301 ] Steve Loughran commented on HDFS-7376: -- do you have a patch for this? Upgrade jsch lib to jsch-0.1.51 to avoid problems running on java7 -- Key: HDFS-7376 URL: https://issues.apache.org/jira/browse/HDFS-7376 Project: Hadoop HDFS Issue Type: Bug Components: build Reporter: Johannes Zillmann We had an application sitting on top of Hadoop and ran into problems using jsch once we switched to java 7. Got this exception: {noformat} com.jcraft.jsch.JSchException: verify: false at com.jcraft.jsch.Session.connect(Session.java:330) at com.jcraft.jsch.Session.connect(Session.java:183) {noformat} Upgrading from jsch-0.1.49 to jsch-0.1.51 fixed the issue for us, but then it conflicted with hadoop's jsch version (we worked around this by jarjar'ing our jsch version). I think jsch was introduced by namenode HA (HDFS-1623), so you should check whether the ssh part works properly on java7, or preemptively upgrade the jsch lib to jsch-0.1.51! Some references to problems reported: - http://sourceforge.net/p/jsch/mailman/jsch-users/thread/loom.20131009t211650-...@post.gmane.org/ - https://issues.apache.org/bugzilla/show_bug.cgi?id=53437
[jira] [Commented] (HDFS-7224) Allow reuse of NN connections via webhdfs
[ https://issues.apache.org/jira/browse/HDFS-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293313#comment-14293313 ] Hudson commented on HDFS-7224: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #86 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/86/]) HDFS-7224. Allow reuse of NN connections via webhdfs. Contributed by Eric Payne (kihwal: rev 2b0fa20f69417326a92beac10ffa072db2616e73) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestFSMainOperationsWebHdfs.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Allow reuse of NN connections via webhdfs - Key: HDFS-7224 URL: https://issues.apache.org/jira/browse/HDFS-7224 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 2.5.0 Reporter: Eric Payne Assignee: Eric Payne Fix For: 2.7.0 Attachments: HDFS-7224.v1.201410301923.txt, HDFS-7224.v2.201410312033.txt, HDFS-7224.v3.txt, HDFS-7224.v4.txt In very large clusters, the webhdfs client could get bind exceptions because it runs out of ephemeral ports. This could happen when using webhdfs to talk to the NN in order to do list globbing of a huge amount of files. WebHdfsFileSystem#jsonParse gets the input/error stream from the connection, but never closes the stream. Since it's not closed, the JVM thinks the stream may still be transferring data, so the next time through this code, it has to get a new connection rather than reusing an existing one. The lack of connection reuse has poor latency and adds too much overhead to the NN.
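The fix described above boils down to always draining and closing the response stream, so the JVM can return the underlying socket to its HTTP keep-alive cache. A minimal sketch with illustrative names (not the actual WebHdfsFileSystem#jsonParse code):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ResponseReader {
    // Fully drains the stream and closes it. For HttpURLConnection, reading
    // to EOF and closing (rather than abandoning the stream) is what allows
    // the JVM to reuse the connection instead of opening a new one, which
    // avoids burning an ephemeral port per request.
    public static String readAndClose(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // try-with-resources guarantees close() even if read() throws
        try (InputStream s = in) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = s.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return out.toString(StandardCharsets.UTF_8.name());
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the NN's JSON response stream.
        String json = readAndClose(
                new ByteArrayInputStream("{\"ok\":true}".getBytes(StandardCharsets.UTF_8)));
        System.out.println(json);
    }
}
```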