[jira] [Commented] (HDFS-8695) OzoneHandler : Add Bucket REST Interface
[ https://issues.apache.org/jira/browse/HDFS-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642579#comment-14642579 ] kanaka kumar avvaru commented on HDFS-8695: --- Thanks [~anu] for updating the patch and the details on the service API. Small nit: {{VolumeHandler.java}}, you may want to remove this file from the patch as there is no code change in it. The comment below is not related to this JIRA, but you may want to consider it during protocol document updates OR handle it separately if my point is correct. {{VolumeHandler#getVolumeInfo}} requires the volume name as a parameter, i.e. a URL {{/IP:PORT/ozone/volume1?info=service}} which lists all volume names doesn't look good. I feel the list-of-volumes service should be a separate method mapped to the root path of Ozone, as volumes are the top-level containers of Ozone. FYI, in S3 the list of buckets is handled on the root of the resource path: {{http://docs.aws.amazon.com/AmazonS3/latest/API/RESTServiceGET.html}} OzoneHandler : Add Bucket REST Interface Key: HDFS-8695 URL: https://issues.apache.org/jira/browse/HDFS-8695 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Reporter: Anu Engineer Assignee: Anu Engineer Attachments: hdfs-8695-HDFS-7240.001.patch, hdfs-8695-HDFS-7240.002.patch Add Bucket REST interface into Ozone server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8810) Correct assertions in TestDFSInotifyEventInputStream class.
[ https://issues.apache.org/jira/browse/HDFS-8810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642624#comment-14642624 ] Hudson commented on HDFS-8810: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #999 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/999/]) HDFS-8810. Correct assertions in TestDFSInotifyEventInputStream class. Contributed by Surendra Singh Lilhore. (aajisaka: rev 1df78688c69476f89d16f93bc74a4f05d0b1a3da) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInotifyEventInputStream.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Correct assertions in TestDFSInotifyEventInputStream class. --- Key: HDFS-8810 URL: https://issues.apache.org/jira/browse/HDFS-8810 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.6.0, 2.7.1 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8810.patch Some assertions are wrong in the test. Here is an example. {code} Assert.assertTrue(batch.getEvents()[0].getEventType() == Event.EventType.RENAME); Event.RenameEvent re3 = (Event.RenameEvent) batch.getEvents()[0]; Assert.assertTrue(re3.getDstPath().equals("/dir/file5")); Assert.assertTrue(re3.getSrcPath().equals("/file5")); Assert.assertTrue(re.getTimestamp() > 0); {code} https://issues.apache.org/jira/browse/HDFS-8735?focusedCommentId=14636571&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14636571 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
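Spelled out as a standalone check (the paths and timestamp below are stand-ins for the values the real test reads from the inotify event batch), the comparisons the test intends are:

```java
public class RenameAssertSketch {
    public static void main(String[] args) {
        // Stand-ins for values pulled from batch.getEvents()[0] in the real test.
        String dstPath = "/dir/file5";
        String srcPath = "/file5";
        long timestamp = 1437968400000L; // any positive event time

        // Rename destination, source, and a positive timestamp are all asserted.
        if (!dstPath.equals("/dir/file5")) throw new AssertionError("dst mismatch");
        if (!srcPath.equals("/file5")) throw new AssertionError("src mismatch");
        if (timestamp <= 0) throw new AssertionError("timestamp must be positive");
        System.out.println("assertions hold");
    }
}
```

In JUnit, `assertEquals(expected, actual)` would give clearer failure messages than wrapping `equals` in `assertTrue`, though the patch itself may keep the original style.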
[jira] [Commented] (HDFS-8810) Correct assertions in TestDFSInotifyEventInputStream class.
[ https://issues.apache.org/jira/browse/HDFS-8810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642614#comment-14642614 ] Hudson commented on HDFS-8810: -- SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #269 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/269/]) HDFS-8810. Correct assertions in TestDFSInotifyEventInputStream class. Contributed by Surendra Singh Lilhore. (aajisaka: rev 1df78688c69476f89d16f93bc74a4f05d0b1a3da) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInotifyEventInputStream.java Correct assertions in TestDFSInotifyEventInputStream class. --- Key: HDFS-8810 URL: https://issues.apache.org/jira/browse/HDFS-8810 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.6.0, 2.7.1 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8810.patch Some assertions are wrong in the test. Here is an example. {code} Assert.assertTrue(batch.getEvents()[0].getEventType() == Event.EventType.RENAME); Event.RenameEvent re3 = (Event.RenameEvent) batch.getEvents()[0]; Assert.assertTrue(re3.getDstPath().equals("/dir/file5")); Assert.assertTrue(re3.getSrcPath().equals("/file5")); Assert.assertTrue(re.getTimestamp() > 0); {code} https://issues.apache.org/jira/browse/HDFS-8735?focusedCommentId=14636571&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14636571 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8799) Erasure Coding: add tests for namenode processing corrupt striped blocks
[ https://issues.apache.org/jira/browse/HDFS-8799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642686#comment-14642686 ] Takanobu Asanuma commented on HDFS-8799: Thanks for the great work, [~walter.k.su]! I read and ran your patch. It looks good to me (non-binding). It also helped me become familiar with EC recovery work. I did some testing with your patch. If we want to recover all EC blocks, we need at least nine datanodes which don't have the corrupt blocks, right? So if there are 3 corrupt EC blocks in one EC block group, we need at least 12 datanodes to recover all the EC blocks. This is not intuitive. How about adding this case? Erasure Coding: add tests for namenode processing corrupt striped blocks Key: HDFS-8799 URL: https://issues.apache.org/jira/browse/HDFS-8799 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Walter Su Assignee: Walter Su Priority: Minor Attachments: HDFS-8799-HDFS-7285.01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
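To make the datanode arithmetic in the comment above concrete: assuming the RS(6,3) schema of the HDFS-7285 branch, each block group spans 9 datanodes, a recovered block must land on a datanode holding no other block of the group, and the nodes still holding corrupt replicas cannot be reused. A small sketch of that count (the schema constants are assumptions, not taken from the patch):

```java
public class EcRecoveryNodeCount {
    // Minimum distinct datanodes needed so every block of one group,
    // including re-recovered ones, gets its own node.
    static int minDatanodes(int dataBlocks, int parityBlocks, int corruptBlocks) {
        int groupWidth = dataBlocks + parityBlocks; // nodes per healthy group
        // groupWidth healthy placements, plus the nodes stuck with corrupt replicas.
        return groupWidth + corruptBlocks;
    }

    public static void main(String[] args) {
        // RS(6,3): 6 data + 3 parity blocks per block group (assumed).
        System.out.println(minDatanodes(6, 3, 0)); // 9: no corruption
        System.out.println(minDatanodes(6, 3, 3)); // 12: the 3-corrupt-blocks case
    }
}
```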
[jira] [Commented] (HDFS-7788) Post-2.6 namenode may not start up with an image containing inodes created with an old release.
[ https://issues.apache.org/jira/browse/HDFS-7788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642823#comment-14642823 ] Sangjin Lee commented on HDFS-7788: --- The backport to 2.6.0 is pretty trivial but the test FS image needs to be recreated for 2.6.0. Also it'd be good to rename the variable (HADOOP_2_7_ZER0_BLOCK_SIZE_TGZ) in TestFSImage for 2.6. I'll post a suggested backport patch for 2.6.0. Post-2.6 namenode may not start up with an image containing inodes created with an old release. --- Key: HDFS-7788 URL: https://issues.apache.org/jira/browse/HDFS-7788 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Rushabh S Shah Priority: Blocker Labels: 2.6.1-candidate Fix For: 2.7.0 Attachments: HDFS-7788-binary.patch, rushabh.patch Before HDFS-4305, which was fixed in 2.1.0-beta, clients could specify arbitrarily small preferred block size for a file including 0. This was normally done by faulty clients or failed creates, but it was possible. Until 2.5, reading a fsimage containing inodes with 0 byte preferred block size was allowed. So if a fsimage contained such an inode, the namenode would come up fine. In 2.6, the preferred block size is required to be > 0. Because of this change, the image that worked with 2.5 may not work with 2.6. If a cluster ran a version of hadoop earlier than 2.1.0-beta before, it is under this risk even if it worked fine with 2.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
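The hardening described above amounts to rejecting non-positive preferred block sizes when loading an fsimage. A minimal illustration of that check (the method and message here are hypothetical stand-ins, not the actual NameNode code):

```java
public class PreferredBlockSizeCheck {
    // Hypothetical stand-in for the stricter post-2.6 fsimage validation:
    // a preferred block size of 0 (or less) is no longer accepted.
    static long validatePreferredBlockSize(long preferredBlockSize) {
        if (preferredBlockSize <= 0) {
            throw new IllegalArgumentException(
                "Invalid preferred block size: " + preferredBlockSize);
        }
        return preferredBlockSize;
    }

    public static void main(String[] args) {
        System.out.println(validatePreferredBlockSize(134217728L)); // 128 MB, fine
        try {
            // What a faulty pre-2.1.0-beta client could have written into the image.
            validatePreferredBlockSize(0);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

An inode like this in an old image is exactly why a NameNode that started fine on 2.5 can fail to start on 2.6.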
[jira] [Commented] (HDFS-8785) TestDistributedFileSystem is failing in trunk
[ https://issues.apache.org/jira/browse/HDFS-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642815#comment-14642815 ] Hudson commented on HDFS-8785: -- FAILURE: Integrated in Hadoop-trunk-Commit #8225 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8225/]) HDFS-8785. TestDistributedFileSystem is failing in trunk. Contributed by Xiaoyu Yao. (xyao: rev 2196e39e142b0f8d1944805db2bfacd4e3244625) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDistributedFileSystem.java TestDistributedFileSystem is failing in trunk - Key: HDFS-8785 URL: https://issues.apache.org/jira/browse/HDFS-8785 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.8.0 Reporter: Arpit Agarwal Assignee: Xiaoyu Yao Fix For: 2.8.0 Attachments: HDFS-8785.00.patch, HDFS-8785.01.patch, HDFS-8785.02.patch A newly added test case {{TestDistributedFileSystem#testDFSClientPeerWriteTimeout}} is failing in trunk. e.g. run https://builds.apache.org/job/PreCommit-HDFS-Build/11716/testReport/org.apache.hadoop.hdfs/TestDistributedFileSystem/testDFSClientPeerWriteTimeout/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7930) commitBlockSynchronization() does not remove locations
[ https://issues.apache.org/jira/browse/HDFS-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated HDFS-7930: -- Labels: 2.6.1-candidate (was: ) commitBlockSynchronization() does not remove locations -- Key: HDFS-7930 URL: https://issues.apache.org/jira/browse/HDFS-7930 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Konstantin Shvachko Assignee: Yi Liu Priority: Blocker Labels: 2.6.1-candidate Fix For: 2.7.0 Attachments: HDFS-7930.001.patch, HDFS-7930.002.patch, HDFS-7930.003.patch When {{commitBlockSynchronization()}} has fewer {{newTargets}} than in the original block it does not remove unconfirmed locations. As a result, the block stores locations of different lengths or genStamps (corrupt). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7788) Post-2.6 namenode may not start up with an image containing inodes created with an old release.
[ https://issues.apache.org/jira/browse/HDFS-7788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated HDFS-7788: -- Attachment: HDFS-7788-2.6.0.patch Post-2.6 namenode may not start up with an image containing inodes created with an old release. --- Key: HDFS-7788 URL: https://issues.apache.org/jira/browse/HDFS-7788 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Rushabh S Shah Priority: Blocker Labels: 2.6.1-candidate Fix For: 2.7.0 Attachments: HDFS-7788-2.6.0.patch, HDFS-7788-binary.patch, rushabh.patch Before HDFS-4305, which was fixed in 2.1.0-beta, clients could specify arbitrarily small preferred block size for a file including 0. This was normally done by faulty clients or failed creates, but it was possible. Until 2.5, reading a fsimage containing inodes with 0 byte preferred block size was allowed. So if a fsimage contained such an inode, the namenode would come up fine. In 2.6, the preferred block size is required to be > 0. Because of this change, the image that worked with 2.5 may not work with 2.6. If a cluster ran a version of hadoop earlier than 2.1.0-beta before, it is under this risk even if it worked fine with 2.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8810) Correct assertions in TestDFSInotifyEventInputStream class.
[ https://issues.apache.org/jira/browse/HDFS-8810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642854#comment-14642854 ] Hudson commented on HDFS-8810: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #266 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/266/]) HDFS-8810. Correct assertions in TestDFSInotifyEventInputStream class. Contributed by Surendra Singh Lilhore. (aajisaka: rev 1df78688c69476f89d16f93bc74a4f05d0b1a3da) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInotifyEventInputStream.java Correct assertions in TestDFSInotifyEventInputStream class. --- Key: HDFS-8810 URL: https://issues.apache.org/jira/browse/HDFS-8810 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.6.0, 2.7.1 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8810.patch Some assertions are wrong in the test. Here is an example. {code} Assert.assertTrue(batch.getEvents()[0].getEventType() == Event.EventType.RENAME); Event.RenameEvent re3 = (Event.RenameEvent) batch.getEvents()[0]; Assert.assertTrue(re3.getDstPath().equals("/dir/file5")); Assert.assertTrue(re3.getSrcPath().equals("/file5")); Assert.assertTrue(re.getTimestamp() > 0); {code} https://issues.apache.org/jira/browse/HDFS-8735?focusedCommentId=14636571&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14636571 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7788) Post-2.6 namenode may not start up with an image containing inodes created with an old release.
[ https://issues.apache.org/jira/browse/HDFS-7788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated HDFS-7788: -- Attachment: image-with-zero-block-size.tar.gz Post-2.6 namenode may not start up with an image containing inodes created with an old release. --- Key: HDFS-7788 URL: https://issues.apache.org/jira/browse/HDFS-7788 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Rushabh S Shah Priority: Blocker Labels: 2.6.1-candidate Fix For: 2.7.0 Attachments: HDFS-7788-2.6.0.patch, HDFS-7788-binary.patch, image-with-zero-block-size.tar.gz, rushabh.patch Before HDFS-4305, which was fixed in 2.1.0-beta, clients could specify arbitrarily small preferred block size for a file including 0. This was normally done by faulty clients or failed creates, but it was possible. Until 2.5, reading a fsimage containing inodes with 0 byte preferred block size was allowed. So if a fsimage contained such an inode, the namenode would come up fine. In 2.6, the preferred block size is required to be > 0. Because of this change, the image that worked with 2.5 may not work with 2.6. If a cluster ran a version of hadoop earlier than 2.1.0-beta before, it is under this risk even if it worked fine with 2.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7788) Post-2.6 namenode may not start up with an image containing inodes created with an old release.
[ https://issues.apache.org/jira/browse/HDFS-7788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated HDFS-7788: -- Labels: 2.6.1-candidate (was: ) Post-2.6 namenode may not start up with an image containing inodes created with an old release. --- Key: HDFS-7788 URL: https://issues.apache.org/jira/browse/HDFS-7788 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Rushabh S Shah Priority: Blocker Labels: 2.6.1-candidate Fix For: 2.7.0 Attachments: HDFS-7788-binary.patch, rushabh.patch Before HDFS-4305, which was fixed in 2.1.0-beta, clients could specify arbitrarily small preferred block size for a file including 0. This was normally done by faulty clients or failed creates, but it was possible. Until 2.5, reading a fsimage containing inodes with 0 byte preferred block size was allowed. So if a fsimage contained such an inode, the namenode would come up fine. In 2.6, the preferred block size is required to be > 0. Because of this change, the image that worked with 2.5 may not work with 2.6. If a cluster ran a version of hadoop earlier than 2.1.0-beta before, it is under this risk even if it worked fine with 2.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8399) Erasure Coding: unit test the behaviour of BlockManager recovery work for the deleted blocks
[ https://issues.apache.org/jira/browse/HDFS-8399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rakesh R updated HDFS-8399: --- Summary: Erasure Coding: unit test the behaviour of BlockManager recovery work for the deleted blocks (was: Erasure Coding: BlockManager is unnecessarily computing recovery work for the deleted blocks) Erasure Coding: unit test the behaviour of BlockManager recovery work for the deleted blocks Key: HDFS-8399 URL: https://issues.apache.org/jira/browse/HDFS-8399 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R Labels: Test Attachments: HDFS-8399-HDFS-7285-00.patch, HDFS-8399-HDFS-7285-01.patch Following exception occurred in the {{ReplicationMonitor}}. As per the initial analysis, I could see the exception is coming for the blocks of the deleted file. {code} 2015-05-14 14:14:40,485 FATAL util.ExitUtil (ExitUtil.java:terminate(127)) - Terminate called org.apache.hadoop.util.ExitUtil$ExitException: java.lang.AssertionError: Absolute path required at org.apache.hadoop.hdfs.server.namenode.INode.getPathNames(INode.java:744) at org.apache.hadoop.hdfs.server.namenode.INode.getPathComponents(INode.java:723) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.getINodesInPath(FSDirectory.java:1655) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getECSchemaForPath(FSNamesystem.java:8435) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeRecoveryWorkForBlocks(BlockManager.java:1572) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeBlockRecoveryWork(BlockManager.java:1402) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3894) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3846) at java.lang.Thread.run(Thread.java:722) at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:126) at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:170) at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3865) at java.lang.Thread.run(Thread.java:722) Exception in thread "org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor@1255079" org.apache.hadoop.util.ExitUtil$ExitException: java.lang.AssertionError: Absolute path required at org.apache.hadoop.hdfs.server.namenode.INode.getPathNames(INode.java:744) at org.apache.hadoop.hdfs.server.namenode.INode.getPathComponents(INode.java:723) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.getINodesInPath(FSDirectory.java:1655) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getECSchemaForPath(FSNamesystem.java:8435) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeRecoveryWorkForBlocks(BlockManager.java:1572) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeBlockRecoveryWork(BlockManager.java:1402) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3894) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3846) at java.lang.Thread.run(Thread.java:722) at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:126) at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:170) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3865) at java.lang.Thread.run(Thread.java:722) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7884) NullPointerException in BlockSender
[ https://issues.apache.org/jira/browse/HDFS-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated HDFS-7884: -- Labels: 2.6.1-candidate (was: ) NullPointerException in BlockSender --- Key: HDFS-7884 URL: https://issues.apache.org/jira/browse/HDFS-7884 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo Nicholas Sze Assignee: Brahma Reddy Battula Priority: Blocker Labels: 2.6.1-candidate Fix For: 2.7.0 Attachments: HDFS-7884-002.patch, HDFS-7884.patch, h7884_20150313.patch, org.apache.hadoop.hdfs.TestAppendSnapshotTruncate-output.txt {noformat} java.lang.NullPointerException at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:264) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:506) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:249) at java.lang.Thread.run(Thread.java:745) {noformat} BlockSender.java:264 is shown below {code} this.volumeRef = datanode.data.getVolume(block).obtainReference(); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7884) NullPointerException in BlockSender
[ https://issues.apache.org/jira/browse/HDFS-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642799#comment-14642799 ] Sangjin Lee commented on HDFS-7884: --- The patch applies to 2.6.0 cleanly. NullPointerException in BlockSender --- Key: HDFS-7884 URL: https://issues.apache.org/jira/browse/HDFS-7884 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo Nicholas Sze Assignee: Brahma Reddy Battula Priority: Blocker Labels: 2.6.1-candidate Fix For: 2.7.0 Attachments: HDFS-7884-002.patch, HDFS-7884.patch, h7884_20150313.patch, org.apache.hadoop.hdfs.TestAppendSnapshotTruncate-output.txt {noformat} java.lang.NullPointerException at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:264) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:506) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:116) at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71) at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:249) at java.lang.Thread.run(Thread.java:745) {noformat} BlockSender.java:264 is shown below {code} this.volumeRef = datanode.data.getVolume(block).obtainReference(); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8810) Correct assertions in TestDFSInotifyEventInputStream class.
[ https://issues.apache.org/jira/browse/HDFS-8810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642810#comment-14642810 ] Hudson commented on HDFS-8810: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #258 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/258/]) HDFS-8810. Correct assertions in TestDFSInotifyEventInputStream class. Contributed by Surendra Singh Lilhore. (aajisaka: rev 1df78688c69476f89d16f93bc74a4f05d0b1a3da) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInotifyEventInputStream.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Correct assertions in TestDFSInotifyEventInputStream class. --- Key: HDFS-8810 URL: https://issues.apache.org/jira/browse/HDFS-8810 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.6.0, 2.7.1 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8810.patch Some assertions are wrong in the test. Here is an example. {code} Assert.assertTrue(batch.getEvents()[0].getEventType() == Event.EventType.RENAME); Event.RenameEvent re3 = (Event.RenameEvent) batch.getEvents()[0]; Assert.assertTrue(re3.getDstPath().equals("/dir/file5")); Assert.assertTrue(re3.getSrcPath().equals("/file5")); Assert.assertTrue(re.getTimestamp() > 0); {code} https://issues.apache.org/jira/browse/HDFS-8735?focusedCommentId=14636571&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14636571 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8785) TestDistributedFileSystem is failing in trunk
[ https://issues.apache.org/jira/browse/HDFS-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642766#comment-14642766 ] Xiaoyu Yao commented on HDFS-8785: -- Thanks all for the review. I will commit the latest patch to fix the Jenkins for now and will open a separate JIRA to improve timeout-related tests via Mock or loop-until-timeout. TestDistributedFileSystem is failing in trunk - Key: HDFS-8785 URL: https://issues.apache.org/jira/browse/HDFS-8785 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.8.0 Reporter: Arpit Agarwal Assignee: Xiaoyu Yao Attachments: HDFS-8785.00.patch, HDFS-8785.01.patch, HDFS-8785.02.patch A newly added test case {{TestDistributedFileSystem#testDFSClientPeerWriteTimeout}} is failing in trunk. e.g. run https://builds.apache.org/job/PreCommit-HDFS-Build/11716/testReport/org.apache.hadoop.hdfs/TestDistributedFileSystem/testDFSClientPeerWriteTimeout/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8785) TestDistributedFileSystem is failing in trunk
[ https://issues.apache.org/jira/browse/HDFS-8785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDFS-8785: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) Commit to 2.8. TestDistributedFileSystem is failing in trunk - Key: HDFS-8785 URL: https://issues.apache.org/jira/browse/HDFS-8785 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0, 2.8.0 Reporter: Arpit Agarwal Assignee: Xiaoyu Yao Fix For: 2.8.0 Attachments: HDFS-8785.00.patch, HDFS-8785.01.patch, HDFS-8785.02.patch A newly added test case {{TestDistributedFileSystem#testDFSClientPeerWriteTimeout}} is failing in trunk. e.g. run https://builds.apache.org/job/PreCommit-HDFS-Build/11716/testReport/org.apache.hadoop.hdfs/TestDistributedFileSystem/testDFSClientPeerWriteTimeout/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7930) commitBlockSynchronization() does not remove locations
[ https://issues.apache.org/jira/browse/HDFS-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642820#comment-14642820 ] Sangjin Lee commented on HDFS-7930: --- The backport to 2.6.0 is more or less straightforward but for the following:
- the test changes were made in TestFileTruncate, which does not exist in 2.6, so they do not apply
- changed the method argument type from BlockInfoContiguous (only in 2.7) to BlockInfo
- modified the logging statement (the upstream code is for the slf4j Logger, which is new in 2.7)
I'm going to attach a suggested patch for 2.6.0. commitBlockSynchronization() does not remove locations -- Key: HDFS-7930 URL: https://issues.apache.org/jira/browse/HDFS-7930 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Konstantin Shvachko Assignee: Yi Liu Priority: Blocker Labels: 2.6.1-candidate Fix For: 2.7.0 Attachments: HDFS-7930.001.patch, HDFS-7930.002.patch, HDFS-7930.003.patch When {{commitBlockSynchronization()}} has fewer {{newTargets}} than in the original block it does not remove unconfirmed locations. As a result, the block stores locations of different lengths or genStamps (corrupt). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8658) Improve retry attempt log in DFSOutputStream#completeFile with current attempt number
[ https://issues.apache.org/jira/browse/HDFS-8658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kanaka kumar avvaru updated HDFS-8658: -- Issue Type: Sub-task (was: Improvement) Parent: HDFS-6185 Improve retry attempt log in DFSOutputStream#completeFile with current attempt number -- Key: HDFS-8658 URL: https://issues.apache.org/jira/browse/HDFS-8658 Project: Hadoop HDFS Issue Type: Sub-task Reporter: kanaka kumar avvaru Assignee: kanaka kumar avvaru Priority: Minor Attachments: HDFS-8658-00.patch In a scenario where a block update was delayed from DN to NN, the following client log is observed, which misleadingly suggests that the file close failed after two attempts when it actually tried 5 times. So, it's better to log the current and remaining attempt counts. {code} 15/06/23 17:05:10 INFO hdfs.DFSClient: Could not complete /write2/tst5484._COPYING_ retrying... 15/06/23 17:05:16 INFO hdfs.DFSClient: Could not complete /write2/tst5484._COPYING_ retrying... copyFromLocal: Unable to close file because the last block does not have enough number of replicas. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
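The proposed improvement is to include the attempt counter in the retry message. A rough sketch of what such a log line could look like (the message format and method name here are suggestions, not the patch's exact wording):

```java
public class CompleteFileRetryLog {
    // Hypothetical message builder; the actual patch may word this differently.
    static String retryMessage(String src, int attempt, int totalAttempts) {
        return "Could not complete " + src + " retrying... (attempt "
            + attempt + " of " + totalAttempts + ")";
    }

    public static void main(String[] args) {
        String src = "/write2/tst5484._COPYING_";
        // With the counter, five attempts no longer read like two.
        for (int attempt = 1; attempt <= 5; attempt++) {
            System.out.println(retryMessage(src, attempt, 5));
        }
    }
}
```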
[jira] [Commented] (HDFS-8810) Correct assertions in TestDFSInotifyEventInputStream class.
[ https://issues.apache.org/jira/browse/HDFS-8810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642769#comment-14642769 ] Hudson commented on HDFS-8810: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2196 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2196/]) HDFS-8810. Correct assertions in TestDFSInotifyEventInputStream class. Contributed by Surendra Singh Lilhore. (aajisaka: rev 1df78688c69476f89d16f93bc74a4f05d0b1a3da) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInotifyEventInputStream.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Correct assertions in TestDFSInotifyEventInputStream class. --- Key: HDFS-8810 URL: https://issues.apache.org/jira/browse/HDFS-8810 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.6.0, 2.7.1 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8810.patch Some assertions are wrong in the test. Here is an example. {code} Assert.assertTrue(batch.getEvents()[0].getEventType() == Event.EventType.RENAME); Event.RenameEvent re3 = (Event.RenameEvent) batch.getEvents()[0]; Assert.assertTrue(re3.getDstPath().equals("/dir/file5")); Assert.assertTrue(re3.getSrcPath().equals("/file5")); Assert.assertTrue(re.getTimestamp() > 0); {code} https://issues.apache.org/jira/browse/HDFS-8735?focusedCommentId=14636571&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14636571 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7930) commitBlockSynchronization() does not remove locations
[ https://issues.apache.org/jira/browse/HDFS-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated HDFS-7930: -- Attachment: HDFS-7930-2.6.0.patch commitBlockSynchronization() does not remove locations -- Key: HDFS-7930 URL: https://issues.apache.org/jira/browse/HDFS-7930 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Konstantin Shvachko Assignee: Yi Liu Priority: Blocker Labels: 2.6.1-candidate Fix For: 2.7.0 Attachments: HDFS-7930-2.6.0.patch, HDFS-7930.001.patch, HDFS-7930.002.patch, HDFS-7930.003.patch When {{commitBlockSynchronization()}} has less {{newTargets}} than in the original block it does not remove unconfirmed locations. This results in that the the block stores locations of different lengths or genStamp (corrupt). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8810) Correct assertions in TestDFSInotifyEventInputStream class.
[ https://issues.apache.org/jira/browse/HDFS-8810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642885#comment-14642885 ] Hudson commented on HDFS-8810: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2215 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2215/]) HDFS-8810. Correct assertions in TestDFSInotifyEventInputStream class. Contributed by Surendra Singh Lilhore. (aajisaka: rev 1df78688c69476f89d16f93bc74a4f05d0b1a3da) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSInotifyEventInputStream.java Correct assertions in TestDFSInotifyEventInputStream class. --- Key: HDFS-8810 URL: https://issues.apache.org/jira/browse/HDFS-8810 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.6.0, 2.7.1 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8810.patch Some assertions are wrong in the test. Here is an example. {code} Assert.assertTrue(batch.getEvents()[0].getEventType() == Event.EventType.RENAME); Event.RenameEvent re3 = (Event.RenameEvent) batch.getEvents()[0]; Assert.assertTrue(re3.getDstPath().equals("/dir/file5")); Assert.assertTrue(re3.getSrcPath().equals("/file5")); Assert.assertTrue(re.getTimestamp() > 0); {code} https://issues.apache.org/jira/browse/HDFS-8735?focusedCommentId=14636571&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14636571 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8695) OzoneHandler : Add Bucket REST Interface
[ https://issues.apache.org/jira/browse/HDFS-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643086#comment-14643086 ] Anu Engineer commented on HDFS-8695: bq. VolumeHandler#getVolumeInfo requires volume name as parameter. i.e. A URL /IP:PORT/ozone/volume1?info=service which lists all volume names doesn't look good. bq. I feel list of volumes service should be another method which should be mapped to root path of Ozone as volumes are the top containers of Ozone. bq. FYI, In S3 the list of the buckets are handled on the root of the resource path http://docs.aws.amazon.com/AmazonS3/latest/API/RESTServiceGET.html That is exactly how it is from the protocol perspective. The info=service is never exposed to the user; it is an internal token generated in code. The end user simply issues GET / and it returns the list of volumes. In fact our API is exactly the same as AWS's for this call. You should be able to see this and test it out when we have the HTTP handler code in place. OzoneHandler : Add Bucket REST Interface Key: HDFS-8695 URL: https://issues.apache.org/jira/browse/HDFS-8695 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Reporter: Anu Engineer Assignee: Anu Engineer Attachments: hdfs-8695-HDFS-7240.001.patch, hdfs-8695-HDFS-7240.002.patch Add Bucket REST interface into Ozone server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks
[ https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643126#comment-14643126 ] Ravi Prakash commented on HDFS-8344: Thanks a lot Masatake! Much appreciated. [~wheat9] Could you please review the latest patch as well? NameNode doesn't recover lease for files with missing blocks Key: HDFS-8344 URL: https://issues.apache.org/jira/browse/HDFS-8344 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Ravi Prakash Assignee: Ravi Prakash Fix For: 2.8.0 Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch, HDFS-8344.06.patch, HDFS-8344.07.patch, HDFS-8344.08.patch I found another(?) instance in which the lease is not recovered. This is reproducible easily on a pseudo-distributed single node cluster # Before you start, it helps to set the following. This is not necessary, but it reduces how long you have to wait: {code} public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000; public static final long LEASE_HARDLIMIT_PERIOD = 2 * LEASE_SOFTLIMIT_PERIOD; {code} # Client starts to write a file. (could be less than 1 block, but it has hflushed, so some of the data has landed on the datanodes) (I'm copying the client code I am using. I generate a jar and run it using $ hadoop jar TestHadoop.jar) # Client crashes. (I simulate this by kill -9 of the $(hadoop jar TestHadoop.jar) process after it has printed "Wrote to the bufferedWriter") # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was only 1) I believe the lease should be recovered and the block should be marked missing. However this is not happening. The lease is never recovered. The effect of this bug for us was that nodes could not be decommissioned cleanly. Although we knew that the client had crashed, the Namenode never released the leases (even after restarting the Namenode, even months afterwards). 
There are actually several other cases too where we don't consider what happens if ALL the datanodes die while the file is being written, but I am going to punt on that for another time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643102#comment-14643102 ] Arpit Agarwal commented on HDFS-7858: - I am +1 on the v11 patch. {{RequestHedgingProxyProvider}} is disabled by default so remaining issues can be addressed separately to avoid spinning on this forever. :-) # One optimization with your new approach - In the common HA case with two NameNodes, after {{performFailover}} is called, {{toIgnore}} will be non-null. We don't need to create a thread pool/completion service; we can simply send the request to the single proxy in the caller's thread. # The TODO is not technically a TODO. We can just document this property in the class Javadoc that it can block indefinitely and depends on the caller implementing a timeout. # A couple of documentation nitpicks: ## _The two implementations which currently ships_ - _The two implementations which currently ship_ ## _so use these_ -- _so use one of these unless you are using a custom proxy provider_ Will hold off committing in case [~jingzhao] has any further comments. Thanks for working on this [~asuresh]. Improve HA Namenode Failover detection on the client Key: HDFS-7858 URL: https://issues.apache.org/jira/browse/HDFS-7858 Project: Hadoop HDFS Issue Type: Improvement Reporter: Arun Suresh Assignee: Arun Suresh Labels: BB2015-05-TBR Attachments: HDFS-7858.1.patch, HDFS-7858.10.patch, HDFS-7858.10.patch, HDFS-7858.11.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, HDFS-7858.3.patch, HDFS-7858.4.patch, HDFS-7858.5.patch, HDFS-7858.6.patch, HDFS-7858.7.patch, HDFS-7858.8.patch, HDFS-7858.9.patch In an HA deployment, Clients are configured with the hostnames of both the Active and Standby Namenodes. Clients will first try one of the NNs (non-deterministically) and if it's a standby NN, then it will respond to the client to retry the request on the other Namenode. 
If the client happens to talk to the Standby first, and the standby is undergoing some GC / is busy, then those clients might not get a response soon enough to try the other NN. Proposed approach to solve this: 1) Since Zookeeper is already used as the failover controller, the clients could talk to ZK and find out which is the active namenode before contacting it. 2) Long-lived DFSClients would have a ZK watch configured which fires when there is a failover so they do not have to query ZK every time to find out the active NN 3) Clients can also cache the last active NN in the user's home directory (~/.lastNN) so that short-lived clients can try that Namenode first before querying ZK -- This message was sent by Atlassian JIRA (v6.3.4#6332)
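The hedging behavior discussed in the review, fanning a request out to all candidate proxies, taking the first successful response, and skipping the thread pool entirely when only one candidate remains after a failover, can be sketched with plain java.util.concurrent primitives. This is only an illustrative sketch under those assumptions, not the actual RequestHedgingProxyProvider code; the method and variable names are made up:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical hedged-invocation sketch: race all proxies, return the
// first successful result, and fall back to a direct call when only
// one proxy is in play (the common post-failover case the review notes).
public class HedgingSketch {
    static <T> T invokeFirstSuccess(List<Callable<T>> proxies) throws Exception {
        if (proxies.size() == 1) {
            // Optimization from the review: no pool/completion service needed;
            // invoke the single remaining proxy in the caller's thread.
            return proxies.get(0).call();
        }
        ExecutorService pool = Executors.newFixedThreadPool(proxies.size());
        try {
            ExecutorCompletionService<T> ecs = new ExecutorCompletionService<>(pool);
            for (Callable<T> p : proxies) {
                ecs.submit(p);
            }
            Exception last = null;
            for (int i = 0; i < proxies.size(); i++) {
                try {
                    return ecs.take().get(); // first successful completion wins
                } catch (Exception e) {
                    last = e;                // e.g. a standby rejected the call; keep waiting
                }
            }
            throw last;                      // every proxy failed
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        Callable<String> standby = () -> { throw new IllegalStateException("standby"); };
        Callable<String> active = () -> "ok";
        System.out.println(invokeFirstSuccess(Arrays.asList(standby, active))); // prints ok
    }
}
```

The standby's failure is swallowed while the completion service waits for the active NameNode's answer, which is why a busy or GC-pausing standby no longer delays the client.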
[jira] [Updated] (HDFS-8695) OzoneHandler : Add Bucket REST Interface
[ https://issues.apache.org/jira/browse/HDFS-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDFS-8695: --- Attachment: hdfs-8695-HDFS-7240.003.patch Removed spurious edit in volumehandler.java OzoneHandler : Add Bucket REST Interface Key: HDFS-8695 URL: https://issues.apache.org/jira/browse/HDFS-8695 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Reporter: Anu Engineer Assignee: Anu Engineer Attachments: hdfs-8695-HDFS-7240.001.patch, hdfs-8695-HDFS-7240.002.patch, hdfs-8695-HDFS-7240.003.patch Add Bucket REST interface into Ozone server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8816) Improve visualization for the Datanode tab in the NN UI
[ https://issues.apache.org/jira/browse/HDFS-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8816: - Attachment: HDFS-8816.003.patch Improve visualization for the Datanode tab in the NN UI --- Key: HDFS-8816 URL: https://issues.apache.org/jira/browse/HDFS-8816 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8816.000.patch, HDFS-8816.001.patch, HDFS-8816.002.patch, HDFS-8816.003.patch, HDFS-8816.png, Screen Shot 2015-07-23 at 10.24.24 AM.png The information of the datanode tab in the NN UI is clogged. This jira proposes to improve the visualization of the datanode tab in the UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated HDFS-7858: -- Attachment: HDFS-7858.12.patch Thanks for the review [~arpitagarwal]. Updated patch: * Incorporated your optimization with a minor modification (with consideration for the case where you might have more than 2 proxies configured) * Updated docs * Updated testcases Will commit it by tomorrow, if you and [~jingzhao] are ok with the latest patch Improve HA Namenode Failover detection on the client Key: HDFS-7858 URL: https://issues.apache.org/jira/browse/HDFS-7858 Project: Hadoop HDFS Issue Type: Improvement Reporter: Arun Suresh Assignee: Arun Suresh Labels: BB2015-05-TBR Attachments: HDFS-7858.1.patch, HDFS-7858.10.patch, HDFS-7858.10.patch, HDFS-7858.11.patch, HDFS-7858.12.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, HDFS-7858.3.patch, HDFS-7858.4.patch, HDFS-7858.5.patch, HDFS-7858.6.patch, HDFS-7858.7.patch, HDFS-7858.8.patch, HDFS-7858.9.patch In an HA deployment, Clients are configured with the hostnames of both the Active and Standby Namenodes. Clients will first try one of the NNs (non-deterministically) and if it's a standby NN, then it will respond to the client to retry the request on the other Namenode. If the client happens to talk to the Standby first, and the standby is undergoing some GC / is busy, then those clients might not get a response soon enough to try the other NN. Proposed approach to solve this: 1) Since Zookeeper is already used as the failover controller, the clients could talk to ZK and find out which is the active namenode before contacting it. 
2) Long-lived DFSClients would have a ZK watch configured which fires when there is a failover so they do not have to query ZK every time to find out the active NN 3) Clients can also cache the last active NN in the user's home directory (~/.lastNN) so that short-lived clients can try that Namenode first before querying ZK -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8818) Allow Balancer to run faster
[ https://issues.apache.org/jira/browse/HDFS-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643274#comment-14643274 ] Jitendra Nath Pandey commented on HDFS-8818: +1 conditional on addressing findbugs/style issues. Allow Balancer to run faster Key: HDFS-8818 URL: https://issues.apache.org/jira/browse/HDFS-8818 Project: Hadoop HDFS Issue Type: Improvement Components: balancer mover Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h8818_20150723.patch The original design of Balancer intentionally makes it run slowly so that the balancing activities won't affect the normal cluster activities and the running jobs. There are new use cases where a cluster admin may choose to balance the cluster when the cluster load is low, or in a maintenance window. So we should have an option to allow Balancer to run faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8816) Improve visualization for the Datanode tab in the NN UI
[ https://issues.apache.org/jira/browse/HDFS-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643166#comment-14643166 ] Haohui Mai commented on HDFS-8816: -- Thanks [~raviprak] for the reviews. The v3 patch addresses the comments. bq. At a lower zoom, the capacity becomes indented awkwardly. Also, there's no indication of %ge in the capacity. The v3 patch should fix the issue. I noticed that there's no percentage in the previous UI, and now the UI uses a different color to indicate the percentage. Maybe we can file a new jira to add a precise percentage if you think it is helpful? bq. With DIFP Failed Volumes would be very useful to have. Is there some other place in the UI we can get that information? The information should be available through the DataNode Volume Failures tab. bq. Although I don't particularly care for the Non DFS Used myself, could you please confirm you meant to remove it? The information is now presented as a popup of the capacity bar. bq. If I had a timestamp there, it would make it a lot easier for me to grep through logs. Good point. The v3 patch shows the timestamp. Improve visualization for the Datanode tab in the NN UI --- Key: HDFS-8816 URL: https://issues.apache.org/jira/browse/HDFS-8816 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8816.000.patch, HDFS-8816.001.patch, HDFS-8816.002.patch, HDFS-8816.003.patch, HDFS-8816.png, Screen Shot 2015-07-23 at 10.24.24 AM.png The information of the datanode tab in the NN UI is clogged. This jira proposes to improve the visualization of the datanode tab in the UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8695) OzoneHandler : Add Bucket REST Interface
[ https://issues.apache.org/jira/browse/HDFS-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643455#comment-14643455 ] Hadoop QA commented on HDFS-8695: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 15m 2s | Findbugs (version ) appears to be broken on HDFS-7240. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 35s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 19s | The applied patch generated 1 release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 34s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 30s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 5s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 160m 39s | Tests failed in hadoop-hdfs. 
| | | | 201m 33s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestDistributedFileSystem | | | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12747394/hdfs-8695-HDFS-7240.003.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | HDFS-7240 / ef128ee | | Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/11846/artifact/patchprocess/patchReleaseAuditProblems.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11846/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11846/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11846/console | This message was automatically generated. OzoneHandler : Add Bucket REST Interface Key: HDFS-8695 URL: https://issues.apache.org/jira/browse/HDFS-8695 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Reporter: Anu Engineer Assignee: Anu Engineer Attachments: hdfs-8695-HDFS-7240.001.patch, hdfs-8695-HDFS-7240.002.patch, hdfs-8695-HDFS-7240.003.patch Add Bucket REST interface into Ozone server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643339#comment-14643339 ] Jing Zhao commented on HDFS-7858: - Thanks for working on this, [~asuresh]! The latest patch looks good to me. +1. Also agree with [~arpitagarwal] that we can keep testing and improving this since this is currently not default. Improve HA Namenode Failover detection on the client Key: HDFS-7858 URL: https://issues.apache.org/jira/browse/HDFS-7858 Project: Hadoop HDFS Issue Type: Improvement Reporter: Arun Suresh Assignee: Arun Suresh Labels: BB2015-05-TBR Attachments: HDFS-7858.1.patch, HDFS-7858.10.patch, HDFS-7858.10.patch, HDFS-7858.11.patch, HDFS-7858.12.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, HDFS-7858.3.patch, HDFS-7858.4.patch, HDFS-7858.5.patch, HDFS-7858.6.patch, HDFS-7858.7.patch, HDFS-7858.8.patch, HDFS-7858.9.patch In an HA deployment, Clients are configured with the hostnames of both the Active and Standby Namenodes.Clients will first try one of the NNs (non-deterministically) and if its a standby NN, then it will respond to the client to retry the request on the other Namenode. If the client happens to talks to the Standby first, and the standby is undergoing some GC / is busy, then those clients might not get a response soon enough to try the other NN. Proposed Approach to solve this : 1) Since Zookeeper is already used as the failover controller, the clients could talk to ZK and find out which is the active namenode before contacting it. 2) Long-lived DFSClients would have a ZK watch configured which fires when there is a failover so they do not have to query ZK everytime to find out the active NN 2) Clients can also cache the last active NN in the user's home directory (~/.lastNN) so that short-lived clients can try that Namenode first before querying ZK -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8816) Improve visualization for the Datanode tab in the NN UI
[ https://issues.apache.org/jira/browse/HDFS-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643473#comment-14643473 ] Hadoop QA commented on HDFS-8816: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 58s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 40s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 42s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 22s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | native | 3m 5s | Pre-build of native portion | | {color:green}+1{color} | hdfs tests | 161m 49s | Tests passed in hadoop-hdfs. 
| | | | 199m 37s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12747396/HDFS-8816.003.patch | | Optional Tests | javadoc javac unit | | git revision | trunk / 2196e39 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11847/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11847/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11847/console | This message was automatically generated. Improve visualization for the Datanode tab in the NN UI --- Key: HDFS-8816 URL: https://issues.apache.org/jira/browse/HDFS-8816 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8816.000.patch, HDFS-8816.001.patch, HDFS-8816.002.patch, HDFS-8816.003.patch, HDFS-8816.png, Screen Shot 2015-07-23 at 10.24.24 AM.png The information of the datanode tab in the NN UI is clogged. This jira proposes to improve the visualization of the datanode tab in the UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8823) Move replication factor into individual blocks
[ https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643586#comment-14643586 ] Andrew Wang commented on HDFS-8823: --- Any comment on additional memory consumption? Move replication factor into individual blocks -- Key: HDFS-8823 URL: https://issues.apache.org/jira/browse/HDFS-8823 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8823.WIP.patch This jira proposes to record the replication factor in the {{BlockInfo}} class. The changes have two advantages: * Decoupling the namespace and the block management layer. It is a prerequisite step to move block management off the heap or to a separate process. * Increased flexibility on replicating blocks. Currently the replication factors of all blocks have to be the same. The replication factors of these blocks are equal to the highest replication factor across all snapshots. The changes will allow blocks in a file to have different replication factor, potentially saving some space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643603#comment-14643603 ] Hadoop QA commented on HDFS-7858: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 21m 42s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:red}-1{color} | javac | 7m 31s | The applied patch generated 1 additional warning messages. | | {color:green}+1{color} | javadoc | 9m 36s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 2m 59s | Site still builds. | | {color:green}+1{color} | checkstyle | 2m 1s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 27s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 19s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | common tests | 22m 20s | Tests failed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 162m 20s | Tests failed in hadoop-hdfs. 
| | | | 235m 17s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.ipc.TestRPC | | | hadoop.fs.TestLocalFsFCStatistics | | | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot | | | hadoop.hdfs.server.namenode.ha.TestRequestHedgingProxyProvider | | | hadoop.hdfs.server.namenode.TestFsck | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12747399/HDFS-7858.12.patch | | Optional Tests | javadoc javac unit findbugs checkstyle site | | git revision | trunk / f36835f | | javac | https://builds.apache.org/job/PreCommit-HDFS-Build/11848/artifact/patchprocess/diffJavacWarnings.txt | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11848/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11848/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11848/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11848/console | This message was automatically generated. 
Improve HA Namenode Failover detection on the client Key: HDFS-7858 URL: https://issues.apache.org/jira/browse/HDFS-7858 Project: Hadoop HDFS Issue Type: Improvement Reporter: Arun Suresh Assignee: Arun Suresh Labels: BB2015-05-TBR Attachments: HDFS-7858.1.patch, HDFS-7858.10.patch, HDFS-7858.10.patch, HDFS-7858.11.patch, HDFS-7858.12.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, HDFS-7858.3.patch, HDFS-7858.4.patch, HDFS-7858.5.patch, HDFS-7858.6.patch, HDFS-7858.7.patch, HDFS-7858.8.patch, HDFS-7858.9.patch In an HA deployment, Clients are configured with the hostnames of both the Active and Standby Namenodes.Clients will first try one of the NNs (non-deterministically) and if its a standby NN, then it will respond to the client to retry the request on the other Namenode. If the client happens to talks to the Standby first, and the standby is undergoing some GC / is busy, then those clients might not get a response soon enough to try the other NN. Proposed Approach to solve this : 1) Since Zookeeper is already used as the failover controller, the clients could talk to ZK and find out which is the active namenode before contacting it. 2) Long-lived DFSClients would have a ZK watch configured which fires when there is a failover so they do not have to query ZK everytime to find out the active NN 2) Clients can also cache the last active NN in the user's home directory (~/.lastNN) so that short-lived clients can try that Namenode first before querying ZK -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8816) Improve visualization for the Datanode tab in the NN UI
[ https://issues.apache.org/jira/browse/HDFS-8816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643562#comment-14643562 ] Haohui Mai commented on HDFS-8816: -- [~raviprak] does the latest patch look good to you? Improve visualization for the Datanode tab in the NN UI --- Key: HDFS-8816 URL: https://issues.apache.org/jira/browse/HDFS-8816 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8816.000.patch, HDFS-8816.001.patch, HDFS-8816.002.patch, HDFS-8816.003.patch, HDFS-8816.png, Screen Shot 2015-07-23 at 10.24.24 AM.png The information of the datanode tab in the NN UI is clogged. This jira proposes to improve the visualization of the datanode tab in the UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8823) Move replication factor into individual blocks
[ https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643569#comment-14643569 ] Haohui Mai commented on HDFS-8823: -- The WIP patch demonstrates the approach for the changes. It's not ready to run Jenkins yet. The key idea is to put the replication factor into the {{BlockInfo}} class, and to update the replication factors in {{setReplication()}} and snapshot related operations. Move replication factor into individual blocks -- Key: HDFS-8823 URL: https://issues.apache.org/jira/browse/HDFS-8823 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8823.WIP.patch This jira proposes to record the replication factor in the {{BlockInfo}} class. The changes have two advantages: * Decoupling the namespace and the block management layer. It is a prerequisite step to move block management off the heap or to a separate process. * Increased flexibility on replicating blocks. Currently the replication factors of all blocks have to be the same. The replication factors of these blocks are equal to the highest replication factor across all snapshots. The changes will allow blocks in a file to have different replication factor, potentially saving some space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
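The key idea described above, a per-block replication field updated by the file-level {{setReplication()}}, with the effective factor under snapshots being the maximum any snapshot requires, can be sketched in a few lines of plain Java. This is illustrative only; the real {{BlockInfo}} and {{INodeFile}} classes are far more involved, and the names below are simplified stand-ins:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of per-block replication (not actual HDFS code).
public class PerBlockReplicationSketch {
    static class BlockInfo {
        final long blockId;
        short replication;            // stored per block, not per file
        BlockInfo(long blockId, short replication) {
            this.blockId = blockId;
            this.replication = replication;
        }
    }

    static class INodeFile {
        final List<BlockInfo> blocks = new ArrayList<>();

        // setReplication() now pushes the new factor down to each block.
        void setReplication(short newReplication) {
            for (BlockInfo b : blocks) {
                b.replication = newReplication;
            }
        }

        // With snapshots, a block must keep the highest factor any
        // snapshot still requires; sketched here as a simple max.
        static short effectiveReplication(short current, short snapshotMax) {
            return (short) Math.max(current, snapshotMax);
        }
    }

    public static void main(String[] args) {
        INodeFile f = new INodeFile();
        f.blocks.add(new BlockInfo(1L, (short) 3));
        f.blocks.add(new BlockInfo(2L, (short) 3));
        f.setReplication((short) 2);
        System.out.println(f.blocks.get(0).replication); // prints 2
    }
}
```

Carrying the factor on the block is what lets the block management layer answer "how many replicas does this block need?" without consulting the namespace, which is the decoupling the jira is after.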
[jira] [Created] (HDFS-8824) Do not use small blocks for balancing the cluster
Tsz Wo Nicholas Sze created HDFS-8824: - Summary: Do not use small blocks for balancing the cluster Key: HDFS-8824 URL: https://issues.apache.org/jira/browse/HDFS-8824 Project: Hadoop HDFS Issue Type: Improvement Components: balancer mover Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Balancer gets datanode block lists from the NN and then moves the blocks in order to balance the cluster. It should not use small blocks, since moving them generates a lot of overhead and they do not help balance the cluster much. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
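The proposal boils down to filtering out blocks below a size threshold when choosing what to move. A hedged sketch of that selection step; the class name, method, and the threshold constant here are assumptions for illustration, not taken from the actual patch:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch: skip small blocks when picking balancing candidates.
public class SmallBlockFilterSketch {
    // Assumed threshold for the example; the real patch may choose differently.
    static final long MIN_BLOCK_SIZE_TO_MOVE = 10L * 1024 * 1024; // 10 MB

    static List<Long> selectMovableBlocks(List<Long> blockSizes) {
        return blockSizes.stream()
            .filter(size -> size >= MIN_BLOCK_SIZE_TO_MOVE) // drop small blocks
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Long> sizes = Arrays.asList(4096L, 64L * 1024 * 1024, 128L * 1024 * 1024);
        System.out.println(selectMovableBlocks(sizes).size()); // prints 2
    }
}
```

Per-block scheduling overhead is roughly constant, so each large block moved transfers far more data per unit of overhead than a small one, which is why the filter helps.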
[jira] [Updated] (HDFS-8818) Allow Balancer to run faster
[ https://issues.apache.org/jira/browse/HDFS-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8818: -- Attachment: h8818_20150727.patch h8818_20150727.patch: fixes the findbugs warning and some trailing whitespaces. Allow Balancer to run faster Key: HDFS-8818 URL: https://issues.apache.org/jira/browse/HDFS-8818 Project: Hadoop HDFS Issue Type: Improvement Components: balancer mover Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h8818_20150723.patch, h8818_20150727.patch The original design of Balancer intentionally makes it run slowly so that the balancing activities won't affect the normal cluster activities and the running jobs. There are new use cases where a cluster admin may choose to balance the cluster when the cluster load is low, or in a maintenance window. So we should have an option to allow Balancer to run faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8823) Move replication factor into individual blocks
[ https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8823: - Attachment: HDFS-8823.WIP.patch Move replication factor into individual blocks -- Key: HDFS-8823 URL: https://issues.apache.org/jira/browse/HDFS-8823 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8823.WIP.patch This jira proposes to record the replication factor in the {{BlockInfo}} class. The changes have two advantages: * Decoupling the namespace and the block management layer. It is a prerequisite step to move block management off the heap or to a separate process. * Increased flexibility on replicating blocks. Currently the replication factors of all blocks have to be the same. The replication factors of these blocks are equal to the highest replication factor across all snapshots. The changes will allow blocks in a file to have different replication factor, potentially saving some space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8695) OzoneHandler : Add Bucket REST Interface
[ https://issues.apache.org/jira/browse/HDFS-8695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643503#comment-14643503 ] Arpit Agarwal commented on HDFS-8695: - Hi [~anu], the patch looks great. One comment - {{handleIOException}} looks out of place in {{BucketProcessTemplate}}. IIUC isn't that exception mapping specific to {{LocalStorageHandler}}? +1 otherwise. OzoneHandler : Add Bucket REST Interface Key: HDFS-8695 URL: https://issues.apache.org/jira/browse/HDFS-8695 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Reporter: Anu Engineer Assignee: Anu Engineer Attachments: hdfs-8695-HDFS-7240.001.patch, hdfs-8695-HDFS-7240.002.patch, hdfs-8695-HDFS-7240.003.patch Add Bucket REST interface into Ozone server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8823) Move replication factor into individual blocks
Haohui Mai created HDFS-8823: Summary: Move replication factor into individual blocks Key: HDFS-8823 URL: https://issues.apache.org/jira/browse/HDFS-8823 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai This jira proposes to record the replication factor in the {{BlockInfo}} class. The changes have two advantages: * Decoupling the namespace and the block management layer. It is a prerequisite step to move block management off the heap or to a separate process. * Increased flexibility on replicating blocks. Currently the replication factors of all blocks in a file have to be the same, equal to the highest replication factor across all snapshots. The changes will allow blocks in a file to have different replication factors, potentially saving some space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8824) Do not use small blocks for balancing the cluster
[ https://issues.apache.org/jira/browse/HDFS-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8824: -- Status: Patch Available (was: Open) Do not use small blocks for balancing the cluster - Key: HDFS-8824 URL: https://issues.apache.org/jira/browse/HDFS-8824 Project: Hadoop HDFS Issue Type: Improvement Components: balancer mover Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h8824_20150727b.patch Balancer gets datanode block lists from the NN and then moves the blocks in order to balance the cluster. It should not use small blocks, since moving them generates a lot of overhead while doing little to balance the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8824) Do not use small blocks for balancing the cluster
[ https://issues.apache.org/jira/browse/HDFS-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8824: -- Attachment: h8824_20150727b.patch h8824_20150727b.patch: add minBlockSize to getBlocks(..). Do not use small blocks for balancing the cluster - Key: HDFS-8824 URL: https://issues.apache.org/jira/browse/HDFS-8824 Project: Hadoop HDFS Issue Type: Improvement Components: balancer mover Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h8824_20150727b.patch Balancer gets datanode block lists from the NN and then moves the blocks in order to balance the cluster. It should not use small blocks, since moving them generates a lot of overhead while doing little to balance the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
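The minBlockSize filtering idea above can be sketched as follows. This is a minimal illustration; {{BlockFilter}} and its method are hypothetical names, not the actual Balancer/getBlocks(..) code:

```java
import java.util.ArrayList;
import java.util.List;

public class BlockFilter {
    // Illustrative: keep only blocks of at least minBlockSize bytes, so the
    // Balancer does not spend RPC and transfer overhead on tiny blocks that
    // barely change a datanode's utilization.
    public static List<Long> filterSmallBlocks(List<Long> blockSizes, long minBlockSize) {
        List<Long> result = new ArrayList<>();
        for (long size : blockSizes) {
            if (size >= minBlockSize) {
                result.add(size);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> sizes = List.of(1024L, 10L * 1024 * 1024, 256L * 1024 * 1024);
        // With a 10 MB floor, only the two larger blocks survive the filter.
        System.out.println(filterSmallBlocks(sizes, 10L * 1024 * 1024));
    }
}
```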
[jira] [Created] (HDFS-8825) Enhancements to Balancer
Tsz Wo Nicholas Sze created HDFS-8825: - Summary: Enhancements to Balancer Key: HDFS-8825 URL: https://issues.apache.org/jira/browse/HDFS-8825 Project: Hadoop HDFS Issue Type: Improvement Components: balancer mover Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze This is an umbrella JIRA to enhance Balancer. The goal is to make it runs faster, more efficient and improve its usability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8826) Balancer may not move blocks efficiently in some cases
Tsz Wo Nicholas Sze created HDFS-8826: - Summary: Balancer may not move blocks efficiently in some cases Key: HDFS-8826 URL: https://issues.apache.org/jira/browse/HDFS-8826 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Balancer is inefficient in the following case:
|| Datanode || Utilization || Rack ||
| D1 | 95% | A |
| D2 | 30% | B |
| D3, D4, D5 | 0% | B |
The average utilization is 25%, so D2 is within the 10% threshold. However, Balancer currently will first move blocks from D2 to D3, D4 and D5 since they are under the same rack. Then, it will move blocks from D1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
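The threshold arithmetic behind "D2 is within the 10% threshold" can be sketched as a small check (illustrative names, not the actual Balancer code):

```java
public class UtilizationCheck {
    // A node is "within threshold" when its utilization differs from the
    // cluster average by no more than the threshold (all values in percent).
    public static boolean withinThreshold(double util, double avg, double threshold) {
        return Math.abs(util - avg) <= threshold;
    }

    public static void main(String[] args) {
        double avg = (95 + 30 + 0 + 0 + 0) / 5.0; // 25% across D1..D5
        System.out.println(withinThreshold(30, avg, 10)); // D2: true, no need to move
        System.out.println(withinThreshold(95, avg, 10)); // D1: false, over-utilized
    }
}
```

So only D1 actually needs to shed blocks; moving data off D2 first is the inefficiency the JIRA describes.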
[jira] [Updated] (HDFS-6861) Separate Balancer specific logic from Dispatcher
[ https://issues.apache.org/jira/browse/HDFS-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6861: -- Issue Type: Sub-task (was: Improvement) Parent: HDFS-8825 Separate Balancer specific logic from Dispatcher Key: HDFS-6861 URL: https://issues.apache.org/jira/browse/HDFS-6861 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Labels: BB2015-05-TBR Attachments: h6861_20140818.patch, h6861_20140819.patch In order to balance datanode storage utilization of a cluster, Balancer (1) classifies datanodes into different groups (overUtilized, aboveAvgUtilized, belowAvgUtilized and underUtilized), (2) chooses source and target datanode pairs and (3) chooses blocks to move. Some of this logic is in Dispatcher. It is better to separate it out. This JIRA is a further work of HDFS-6828. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-852) Balancer shutdown synchronisation could do with a review
[ https://issues.apache.org/jira/browse/HDFS-852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze resolved HDFS-852. -- Resolution: Not A Problem I think this issue got stale. Resolving as Not a Problem. Please feel free to reopen if you disagree. Balancer shutdown synchronisation could do with a review Key: HDFS-852 URL: https://issues.apache.org/jira/browse/HDFS-852 Project: Hadoop HDFS Issue Type: Improvement Components: balancer mover Affects Versions: 0.22.0 Reporter: Steve Loughran Priority: Minor Looking at the source of the Balancer, there are a lot of {{catch(InterruptedException)}} clauses, which run the risk of swallowing exceptions, making it harder to shut down a balancer. For example, the {{AccessKeyUpdater}} swallows the InterruptedExceptions which get used to tell it to shut down, and while it does poll the shared field {{shouldRun}}, that field isn't volatile: the shutdown may not work. Elsewhere, the {{dispatchBlocks()}} method swallows interruptions without even looking for any shutdown flag. This is all minor as it is shutdown logic, but it is the stuff that is hard to test and leads to problems in the field, the problems that leave the ops team resorting to {{kill -9}}, and we don't want that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
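The pattern the comment is asking for, a volatile shutdown flag plus interrupts that are restored rather than swallowed, looks roughly like this. This is an illustrative sketch, not the Balancer's actual code:

```java
public class Worker implements Runnable {
    // volatile so a stop request made from another thread is immediately
    // visible to the run() loop, unlike the non-volatile shouldRun field
    // criticised in the JIRA.
    private volatile boolean shouldRun = true;

    public void shutdown() {
        shouldRun = false;
    }

    @Override
    public void run() {
        while (shouldRun) {
            try {
                Thread.sleep(1000); // stand-in for one unit of balancing work
            } catch (InterruptedException e) {
                // Do not swallow the interrupt: restore the flag and exit,
                // so callers like kill/stop scripts actually stop the thread.
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```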
[jira] [Commented] (HDFS-6861) Separate Balancer specific logic from Dispatcher
[ https://issues.apache.org/jira/browse/HDFS-6861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643711#comment-14643711 ] Hadoop QA commented on HDFS-6861: -
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. |
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12662699/h6861_20140819.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 3572ebd |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11851/console |
This message was automatically generated. Separate Balancer specific logic from Dispatcher Key: HDFS-6861 URL: https://issues.apache.org/jira/browse/HDFS-6861 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Labels: BB2015-05-TBR Attachments: h6861_20140818.patch, h6861_20140819.patch In order to balance datanode storage utilization of a cluster, Balancer (1) classifies datanodes into different groups (overUtilized, aboveAvgUtilized, belowAvgUtilized and underUtilized), (2) chooses source and target datanode pairs and (3) chooses blocks to move. Some of this logic is in Dispatcher. It is better to separate it out. This JIRA is a further work of HDFS-6828. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8768) Erasure Coding: block group ID displayed in WebUI is not consistent with fsck
[ https://issues.apache.org/jira/browse/HDFS-8768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643719#comment-14643719 ] GAO Rui commented on HDFS-8768: --- Yes, I think so too. But it seems that I cannot move this JIRA to the Closed status. Can you help me close it? Thank you [~zhz] Erasure Coding: block group ID displayed in WebUI is not consistent with fsck - Key: HDFS-8768 URL: https://issues.apache.org/jira/browse/HDFS-8768 Project: Hadoop HDFS Issue Type: Sub-task Reporter: GAO Rui Attachments: Screen Shot 2015-07-14 at 15.33.08.png, screen-shot-with-HDFS-8779-patch.PNG This is duplicated by [HDFS-8779]. For example, in the WebUI (usually namenode port 50070), one Erasure Code file with one block group was displayed as the attached screenshot [^Screen Shot 2015-07-14 at 15.33.08.png]. But, with the fsck command, the block group of the same file was displayed like: {{0. BP-1130999596-172.23.38.10-1433791629728:blk_-9223372036854740160_3384 len=6438256640}} After checking block file names in datanodes, we believe WebUI may have some problem with Erasure Code block group display. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-5958) One very large node in a cluster prevents balancer from balancing data
[ https://issues.apache.org/jira/browse/HDFS-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643750#comment-14643750 ] Tsz Wo Nicholas Sze commented on HDFS-5958: --- ... in my case I had a 20% utilized box, a few 80-90% utilized ones and that huge 4% utilized machine, ... In this case, Balancer should move blocks from the 80-90% utilized machines to the other machines. No? One very large node in a cluster prevents balancer from balancing data -- Key: HDFS-5958 URL: https://issues.apache.org/jira/browse/HDFS-5958 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Affects Versions: 2.2.0 Environment: Hadoop cluster with 4 nodes: 3 with 500Gb drives and one with 4Tb drive. Reporter: Alexey Kovyrin In a cluster with a set of small nodes and one much larger node balancer always selects the large node as the target even though it already has a copy of each block in the cluster. This causes the balancer to enter an infinite loop and stop balancing other nodes because each balancing iteration selects the same target and then could not find a single block to move. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-1676) DateFormat.getDateTimeInstance() is very expensive, we can cache it to improve performance
[ https://issues.apache.org/jira/browse/HDFS-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze resolved HDFS-1676. --- Resolution: Not A Problem Resolving this as not-a-problem. Please feel free to reopen if you disagree. DateFormat.getDateTimeInstance() is very expensive, we can cache it to improve performance -- Key: HDFS-1676 URL: https://issues.apache.org/jira/browse/HDFS-1676 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Affects Versions: 0.21.0 Reporter: Xiaoming Shi Labels: newbie In the file: ./hadoop-0.21.0/hdfs/src/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java line:1520 In the while loop, DateFormat.getDateTimeInstance() is called in each iteration. We can cache the result by moving it outside the loop or adding a class member. This is similar to the Apache bug https://issues.apache.org/bugzilla/show_bug.cgi?id=48778 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
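The suggested fix, caching the formatter instead of re-creating it every iteration, can be sketched as below. The class is a hypothetical illustration, not the actual Balancer code; note that {{DateFormat}} is not thread-safe, so a single cached instance only suits a single-threaded loop like the one described:

```java
import java.text.DateFormat;
import java.util.Date;

public class TimestampLogger {
    // Cache the formatter as a class member instead of calling the expensive
    // DateFormat.getDateTimeInstance() on every loop iteration.
    private static final DateFormat FORMAT = DateFormat.getDateTimeInstance();

    public static String format(Date d) {
        return FORMAT.format(d);
    }

    public static void main(String[] args) {
        // The loop body reuses the one cached instance.
        for (int i = 0; i < 3; i++) {
            System.out.println(format(new Date()));
        }
    }
}
```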
[jira] [Commented] (HDFS-3619) isGoodBlockCandidate() in Balancer is not handling properly if replica factor 3
[ https://issues.apache.org/jira/browse/HDFS-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643679#comment-14643679 ] Tsz Wo Nicholas Sze commented on HDFS-3619: --- Does this constraint look too strong, making it slightly inconsistent with BlockPlacementPolicyDefault? Yes, Balancer has a stronger constraint. Let's keep it like this so that it can easily support a different BlockPlacementPolicy. isGoodBlockCandidate() in Balancer is not handling properly if replica factor 3 Key: HDFS-3619 URL: https://issues.apache.org/jira/browse/HDFS-3619 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Junping Du Assignee: Junping Du Let's assume: 1. replica factor = 4 2. source node in rack 1 has the 1st replica, the 2nd and 3rd replicas are in rack 2, the 4th replica is in rack 3 and the target node is in rack 3. So, it should be good for the balancer to move a replica from the source node to the target node, but isGoodBlockCandidate() will return false. I think we can fix it by simply checking that at least one replica node (other than the source) is on a different rack from the target node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-3619) isGoodBlockCandidate() in Balancer is not handling properly if replica factor 3
[ https://issues.apache.org/jira/browse/HDFS-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze resolved HDFS-3619. --- Resolution: Not A Problem Resolving as not-a-problem. Please feel free to reopen if you disagree. isGoodBlockCandidate() in Balancer is not handling properly if replica factor 3 Key: HDFS-3619 URL: https://issues.apache.org/jira/browse/HDFS-3619 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Junping Du Assignee: Junping Du Let's assume: 1. replica factor = 4 2. source node in rack 1 has the 1st replica, the 2nd and 3rd replicas are in rack 2, the 4th replica is in rack 3 and the target node is in rack 3. So, it should be good for the balancer to move a replica from the source node to the target node, but isGoodBlockCandidate() will return false. I think we can fix it by simply checking that at least one replica node (other than the source) is on a different rack from the target node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3570) Balancer shouldn't rely on DFS Space Used % as that ignores non-DFS used space
[ https://issues.apache.org/jira/browse/HDFS-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643695#comment-14643695 ] Tsz Wo Nicholas Sze commented on HDFS-3570: --- Have you set dfs.datanode.du.reserved for the non-dfs used space? Balancer shouldn't rely on DFS Space Used % as that ignores non-DFS used space Key: HDFS-3570 URL: https://issues.apache.org/jira/browse/HDFS-3570 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Akira AJISAKA Priority: Minor Attachments: HDFS-3570.003.patch, HDFS-3570.2.patch, HDFS-3570.aash.1.patch Report from a user here: https://groups.google.com/a/cloudera.org/d/msg/cdh-user/pIhNyDVxdVY/b7ENZmEvBjIJ, post archived at http://pastebin.com/eVFkk0A0 This user had a specific DN that had a large non-DFS usage among dfs.data.dirs, and very little DFS usage (which is computed against total possible capacity). Balancer apparently only looks at the usage, and fails to consider that non-DFS usage may also be high on a DN/cluster. Hence, it thinks that if a DFS Usage report from a DN is only 8%, it's got a lot of free space to write more blocks, when that isn't true as shown by the case of this user. It went on scheduling writes to the DN to balance it out, but the DN simply can't accept any more blocks as a result of its disks' state. I think it would be better if we _computed_ the actual utilization based on {{(capacity - (actual remaining space))/(capacity)}}, as opposed to the current {{(dfs used)/(capacity)}}. Thoughts? This isn't very critical, however, because it is very rare to see DN space being used for non-DN data, but it does expose a valid bug. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
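The two utilization formulas being compared can be illustrated with a small sketch (class and method names are hypothetical, not the actual Balancer code):

```java
public class NodeUtilization {
    // Current behavior: DFS-used over capacity. Non-DFS usage is invisible,
    // so a disk that is nearly full of non-DFS data can still look empty.
    public static double dfsUsedUtilization(long dfsUsed, long capacity) {
        return 100.0 * dfsUsed / capacity;
    }

    // Proposed: derive utilization from what is actually still free, so any
    // non-DFS usage counts against the node as well.
    public static double remainingBasedUtilization(long remaining, long capacity) {
        return 100.0 * (capacity - remaining) / capacity;
    }

    public static void main(String[] args) {
        long capacity = 1000, dfsUsed = 80, nonDfsUsed = 800;
        long remaining = capacity - dfsUsed - nonDfsUsed; // 120 left
        System.out.println(dfsUsedUtilization(dfsUsed, capacity));          // 8.0  -> looks under-utilized
        System.out.println(remainingBasedUtilization(remaining, capacity)); // 88.0 -> actually almost full
    }
}
```

With the DFS-used formula the node reports 8% and attracts block moves it cannot absorb; the remaining-based formula correctly reports 88%.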
[jira] [Updated] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated HDFS-7858: -- Attachment: HDFS-7858.13.patch The testcase failures seem spurious. Attaching a patch to fix the javac warnings and clean up some imports. Thanks for the reviews [~jingzhao], [~arpitagarwal], [~atm] [~bikassaha]. Will be committing after the next jenkins run. Improve HA Namenode Failover detection on the client Key: HDFS-7858 URL: https://issues.apache.org/jira/browse/HDFS-7858 Project: Hadoop HDFS Issue Type: Improvement Reporter: Arun Suresh Assignee: Arun Suresh Labels: BB2015-05-TBR Attachments: HDFS-7858.1.patch, HDFS-7858.10.patch, HDFS-7858.10.patch, HDFS-7858.11.patch, HDFS-7858.12.patch, HDFS-7858.13.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, HDFS-7858.3.patch, HDFS-7858.4.patch, HDFS-7858.5.patch, HDFS-7858.6.patch, HDFS-7858.7.patch, HDFS-7858.8.patch, HDFS-7858.9.patch In an HA deployment, clients are configured with the hostnames of both the Active and Standby Namenodes. Clients will first try one of the NNs (non-deterministically) and if it's a standby NN, it will respond to the client to retry the request on the other Namenode. If the client happens to talk to the Standby first, and the standby is undergoing some GC / is busy, then those clients might not get a response soon enough to try the other NN. Proposed approach to solve this: 1) Since Zookeeper is already used as the failover controller, the clients could talk to ZK and find out which is the active namenode before contacting it. 2) Long-lived DFSClients would have a ZK watch configured which fires when there is a failover, so they do not have to query ZK every time to find out the active NN. 3) Clients can also cache the last active NN in the user's home directory (~/.lastNN) so that short-lived clients can try that Namenode first before querying ZK. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity
[ https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643733#comment-14643733 ] Tsz Wo Nicholas Sze commented on HDFS-8287: --- I think we should have double buffering. In the beginning, the user client writes data to the first buffer. When the first cell stripe is full, the user client continues writing to the second buffer. At the same time, the fastest parity streamer picks up the first buffer and computes the parity. Once all the parity has been computed, the other parity streamers enqueue the parity packets. Once all parity packets are enqueued, the first buffer can be released, so that when the second buffer is full, the user client can continue writing to the first buffer again. DFSStripedOutputStream.writeChunk should not wait for writing parity - Key: HDFS-8287 URL: https://issues.apache.org/jira/browse/HDFS-8287 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Kai Sasaki When a striping cell is full, writeChunk computes and generates parity packets. It sequentially calls waitAndQueuePacket so that the user client cannot continue to write data until it finishes. We should allow the user client to continue writing instead of blocking it while parity is written. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
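The double-buffering hand-off described in the comment can be sketched with two blocking queues: the writer fills one buffer while the parity side drains the other, and a buffer only returns to the writer after its parity has been enqueued. This is a hypothetical illustration, not the DFSStripedOutputStream code:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class DoubleBuffer {
    // Two fixed cell-stripe buffers: "free" holds buffers the writer may
    // fill; "full" holds buffers awaiting parity computation. The writer
    // only blocks when both buffers are in flight on the parity side.
    private final BlockingQueue<byte[]> free = new ArrayBlockingQueue<>(2);
    private final BlockingQueue<byte[]> full = new ArrayBlockingQueue<>(2);

    public DoubleBuffer(int cellSize) {
        free.add(new byte[cellSize]);
        free.add(new byte[cellSize]);
    }

    // Writer side: take a free buffer, fill it, hand it to the parity side.
    public byte[] acquireForWrite() throws InterruptedException { return free.take(); }
    public void submitFull(byte[] buf) throws InterruptedException { full.put(buf); }

    // Parity side: take a full buffer, compute/enqueue parity, then release it.
    public byte[] takeFull() throws InterruptedException { return full.take(); }
    public void release(byte[] buf) throws InterruptedException { free.put(buf); }

    public static void main(String[] args) throws Exception {
        DoubleBuffer db = new DoubleBuffer(64 * 1024);
        byte[] cell = db.acquireForWrite(); // writer fills this stripe
        db.submitFull(cell);                // hand off to the parity streamers
        byte[] same = db.takeFull();        // parity side picks it up
        db.release(same);                   // buffer recycled for the writer
        System.out.println(same == cell);   // true: the two buffers are reused
    }
}
```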
[jira] [Commented] (HDFS-8499) Refactor BlockInfo class hierarchy with static helper class
[ https://issues.apache.org/jira/browse/HDFS-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643747#comment-14643747 ] Tsz Wo Nicholas Sze commented on HDFS-8499: --- I was pointing out that BlockInfoUnderConstruction itself no longer is-a BlockInfo. That changes how NN uses BlockInfoUnderConstruction. ... Could you give some examples? In the beginning of the project we chose to extend BlockInfo to handle block groups, instead of building something like BlockGroupInfo from scratch. One of the most important reasons was that most block mgmt logic (including UC logic) is orthogonal to striping logic. ... You seem to be saying BlockInfo could handle both contiguous and striped blocks. No? Actually, we extend BlockInfo to BlockInfoContiguous and BlockInfoStriped so that contiguous blocks and striped blocks can be handled differently. Similarly for UC, we want to handle contiguous and striped uc blocks differently. Anyway, it does not make sense to handle the contiguous block logic differently in BlockInfoContiguous and BlockInfoContiguousUC (and handle the striped block logic differently in BlockInfoStriped and BlockInfoStripedUC), so we don't want to choose Design #1. In the reworked structure, ... We don't need to rework anything. It is already done in the HDFS-7285 branch. I think we should simply revert HDFS-8499 and the related patches. BTW, HDFS-8499 and the related patches are a bigger change that we probably should not have committed to trunk in the first place before merging HDFS-7285.
Refactor BlockInfo class hierarchy with static helper class --- Key: HDFS-8499 URL: https://issues.apache.org/jira/browse/HDFS-8499 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.7.0 Reporter: Zhe Zhang Assignee: Zhe Zhang Fix For: 2.8.0 Attachments: HDFS-8499.00.patch, HDFS-8499.01.patch, HDFS-8499.02.patch, HDFS-8499.03.patch, HDFS-8499.04.patch, HDFS-8499.05.patch, HDFS-8499.06.patch, HDFS-8499.07.patch, HDFS-8499.UCFeature.patch, HDFS-bistriped.patch In HDFS-7285 branch, the {{BlockInfoUnderConstruction}} interface provides a common abstraction for striped and contiguous UC blocks. This JIRA aims to merge it to trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8824) Do not use small blocks for balancing the cluster
[ https://issues.apache.org/jira/browse/HDFS-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8824: -- Issue Type: Sub-task (was: Improvement) Parent: HDFS-8825 Do not use small blocks for balancing the cluster - Key: HDFS-8824 URL: https://issues.apache.org/jira/browse/HDFS-8824 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h8824_20150727b.patch Balancer gets datanode block lists from NN and then move the blocks in order to balance the cluster. It should not use the blocks with small size since moving the small blocks generates a lot of overhead and the small blocks do not help balancing the cluster much. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8344) NameNode doesn't recover lease for files with missing blocks
[ https://issues.apache.org/jira/browse/HDFS-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643645#comment-14643645 ] Haohui Mai commented on HDFS-8344: -- Sorry for the delay. Will do it in the next couple days. NameNode doesn't recover lease for files with missing blocks Key: HDFS-8344 URL: https://issues.apache.org/jira/browse/HDFS-8344 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Ravi Prakash Assignee: Ravi Prakash Fix For: 2.8.0 Attachments: HDFS-8344.01.patch, HDFS-8344.02.patch, HDFS-8344.03.patch, HDFS-8344.04.patch, HDFS-8344.05.patch, HDFS-8344.06.patch, HDFS-8344.07.patch, HDFS-8344.08.patch I found another(?) instance in which the lease is not recovered. This is reproducible easily on a pseudo-distributed single node cluster: # Before you start, it helps if you set the following. This is not necessary, but simply reduces how long you have to wait {code} public static final long LEASE_SOFTLIMIT_PERIOD = 30 * 1000; public static final long LEASE_HARDLIMIT_PERIOD = 2 * LEASE_SOFTLIMIT_PERIOD; {code} # Client starts to write a file. (could be less than 1 block, but it has hflushed, so some of the data has landed on the datanodes) (I'm copying the client code I am using. I generate a jar and run it using $ hadoop jar TestHadoop.jar) # Client crashes. (I simulate this by kill -9 of the $(hadoop jar TestHadoop.jar) process after it has printed "Wrote to the bufferedWriter") # Shoot the datanode. (Since I ran on a pseudo-distributed cluster, there was only 1) I believe the lease should be recovered and the block should be marked missing. However this is not happening. The lease is never recovered. The effect of this bug for us was that nodes could not be decommissioned cleanly. Although we knew that the client had crashed, the Namenode never released the leases (even after restarting the Namenode) (even months afterwards).
There are actually several other cases too where we don't consider what happens if ALL the datanodes die while the file is being written, but I am going to punt on that for another time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8818) Allow Balancer to run faster
[ https://issues.apache.org/jira/browse/HDFS-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8818: -- Issue Type: Sub-task (was: Improvement) Parent: HDFS-8825 Allow Balancer to run faster Key: HDFS-8818 URL: https://issues.apache.org/jira/browse/HDFS-8818 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h8818_20150723.patch, h8818_20150727.patch The original design of Balancer intentionally makes it run slowly so that the balancing activities won't affect the normal cluster activities and the running jobs. There is a new use case where a cluster admin may choose to balance the cluster when the cluster load is low, or in a maintenance window, so we should have an option to allow Balancer to run faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-742) A down DataNode makes Balancer hang by repeatedly asking NameNode for its partial block list
[ https://issues.apache.org/jira/browse/HDFS-742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643655#comment-14643655 ] Tsz Wo Nicholas Sze commented on HDFS-742: -- [~mitdesai], sorry for the late review. The patch looks good. Could you update the patch with current trunk? A down DataNode makes Balancer hang by repeatedly asking NameNode for its partial block list Key: HDFS-742 URL: https://issues.apache.org/jira/browse/HDFS-742 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Reporter: Hairong Kuang Assignee: Mit Desai Attachments: HDFS-742.patch We had a balancer that had not made any progress for a long time. It turned out it was repeatedly asking the Namenode for a partial block list of one datanode, which was down while the balancer was running. NameNode should notify Balancer that the datanode is not available and Balancer should stop asking for the datanode's block list. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3411) Balancer fails to balance blocks between aboveAvgUtilized and belowAvgUtilized datanodes.
[ https://issues.apache.org/jira/browse/HDFS-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643677#comment-14643677 ] Tsz Wo Nicholas Sze commented on HDFS-3411: --- {quote} Now DN1 is added into aboveAvgUtilizedDatanodes and DN2 into belowAvgUtilizedDatanodes. Hence overLoadedBytes and underLoadedBytes will be equal to 0. Resulting in bytesLeftToMove equal to 0. Thus balancer will exit without balancing the blocks. {quote} Everything works as expected. Nothing to balance since both DN1 and DN2 are within threshold. Balancer fails to balance blocks between aboveAvgUtilized and belowAvgUtilized datanodes. - Key: HDFS-3411 URL: https://issues.apache.org/jira/browse/HDFS-3411 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Affects Versions: 0.23.0 Reporter: Ashish Singhi Scenario: replication set to 1. 1. Start 1 NN and 1 DN 2. Pump 1GB of data. 3. Start one more DN 4. Run balancer with threshold 1. Now DN1 is added into aboveAvgUtilizedDatanodes and DN2 into belowAvgUtilizedDatanodes. Hence overLoadedBytes and underLoadedBytes will be equal to 0, resulting in bytesLeftToMove equal to 0. Thus the balancer will exit without balancing the blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-3411) Balancer fails to balance blocks between aboveAvgUtilized and belowAvgUtilized datanodes.
[ https://issues.apache.org/jira/browse/HDFS-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze resolved HDFS-3411. --- Resolution: Not A Problem Resolving as not-a-problem. Please feel free to reopen if you disagree. Balancer fails to balance blocks between aboveAvgUtilized and belowAvgUtilized datanodes. - Key: HDFS-3411 URL: https://issues.apache.org/jira/browse/HDFS-3411 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Affects Versions: 0.23.0 Reporter: Ashish Singhi Scenario: replication set to 1. 1. Start 1 NN and 1 DN 2. Pump 1GB of data. 3. Start one more DN 4. Run balancer with threshold 1. Now DN1 is added into aboveAvgUtilizedDatanodes and DN2 into belowAvgUtilizedDatanodes. Hence overLoadedBytes and underLoadedBytes will be equal to 0, resulting in bytesLeftToMove equal to 0. Thus the balancer will exit without balancing the blocks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-4664) HDFS for heterogeneous environment
[ https://issues.apache.org/jira/browse/HDFS-4664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-4664: -- Component/s: (was: balancer mover) HDFS for heterogeneous environment -- Key: HDFS-4664 URL: https://issues.apache.org/jira/browse/HDFS-4664 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.20.2 Environment: Ubuntu Linux, Institutional lab Reporter: Mohammad Mustaqeem I want to use HDFS for storing the files in the institutional labs. Here it is to be noted that the nodes in the labs are not all of the same type; some nodes stay on for a longer duration while some only for a short duration. In addition, the labs are not all the same: some labs have a UPS facility and some have more nodes. If I consider a lab as a rack, then we should not choose the racks and nodes randomly in replica placement. We should give more priority to those nodes that stay on for a longer duration and to those labs which have a UPS facility and more systems. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7639) Remove the limitation imposed by dfs.balancer.moverThreads
[ https://issues.apache.org/jira/browse/HDFS-7639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-7639: -- Issue Type: Sub-task (was: Improvement) Parent: HDFS-8825 Remove the limitation imposed by dfs.balancer.moverThreads -- Key: HDFS-7639 URL: https://issues.apache.org/jira/browse/HDFS-7639 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Reporter: Tsz Wo Nicholas Sze Assignee: Chen He In Balancer/Mover, the number of dispatcher threads (dfs.balancer.moverThreads) limits the number of concurrent moves. Each dispatcher thread sends a request to a datanode and then blocks waiting for the response. We should remove this limitation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
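The effect of the limitation can be sketched with a toy model (not the actual Dispatcher code): with a fixed pool of blocking dispatcher threads, at most poolSize moves are in flight, so a batch of moves needs ceil(moves / threads) sequential rounds.

```java
import java.util.concurrent.*;

// Toy model: each "move" blocks its dispatcher thread until the (simulated)
// datanode reply arrives, so concurrency is capped at the pool size.
public class DispatcherSketch {
    // With a fixed pool, 'moves' blocking transfers finish in
    // ceil(moves / threads) sequential rounds.
    static int rounds(int moves, int threads) {
        return (moves + threads - 1) / threads;
    }

    public static void main(String[] args) throws Exception {
        int moverThreads = 2; // stand-in for a small dfs.balancer.moverThreads
        ExecutorService pool = Executors.newFixedThreadPool(moverThreads);
        CountDownLatch done = new CountDownLatch(4);
        for (int i = 0; i < 4; i++) {
            pool.submit(() -> {
                try { Thread.sleep(50); }           // stand-in for blocking on the DN reply
                catch (InterruptedException ignored) { }
                done.countDown();
            });
        }
        done.await();
        pool.shutdown();
        System.out.println(rounds(4, moverThreads)); // 2: the 4 moves ran in 2 rounds
    }
}
```

Removing the limitation essentially means decoupling "request sent" from "thread occupied", e.g. via asynchronous responses, so the thread count no longer bounds concurrent moves.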
[jira] [Updated] (HDFS-8278) HDFS Balancer should consider remaining storage % when checking for under-utilized machines
[ https://issues.apache.org/jira/browse/HDFS-8278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8278: -- Issue Type: Sub-task (was: Bug) Parent: HDFS-8825 HDFS Balancer should consider remaining storage % when checking for under-utilized machines --- Key: HDFS-8278 URL: https://issues.apache.org/jira/browse/HDFS-8278 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Affects Versions: 2.8.0 Reporter: Gopal V Assignee: Tsz Wo Nicholas Sze DFS balancer mistakenly identifies a node with very little storage space remaining as an underutilized node and tries to move large amounts of data to that particular node. All these block moves fail to execute successfully, as the % utilization is less relevant than the dfs remaining storage on that node. {code} 15/04/24 04:25:55 INFO balancer.Balancer: 0 over-utilized: [] 15/04/24 04:25:55 INFO balancer.Balancer: 1 underutilized: [172.19.1.46:50010:DISK] 15/04/24 04:25:55 INFO balancer.Balancer: Need to move 47.68 GB to make the cluster balanced. 15/04/24 04:25:55 INFO balancer.Balancer: Decided to move 413.08 MB bytes from 172.19.1.52:50010:DISK to 172.19.1.46:50010:DISK 15/04/24 04:25:55 INFO balancer.Balancer: Will move 413.08 MB in this iteration 15/04/24 04:25:55 WARN balancer.Dispatcher: Failed to move blk_1078689321_1099517353638 with size=131146 from 172.19.1.52:50010:DISK to 172.19.1.46:50010:DISK through 172.19.1.53:50010: Got error, status message opReplaceBlock BP-942051088-172.18.1.41-1370508013893:blk_1078689321_1099517353638 received exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: Out of space: The volume with the most available space (=225042432 B) is less than the block size (=268435456 B)., block move is failed {code} The machine in concern is under-full when it comes to the BP utilization, but has very little free space available for blocks. 
{code}
Decommission Status : Normal
Configured Capacity: 3826907185152 (3.48 TB)
DFS Used: 2817262833664 (2.56 TB)
Non DFS Used: 1000621305856 (931.90 GB)
DFS Remaining: 9023045632 (8.40 GB)
DFS Used%: 73.62%
DFS Remaining%: 0.24%
Configured Cache Capacity: 8589934592 (8 GB)
Cache Used: 0 (0 B)
Cache Remaining: 8589934592 (8 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%
Xceivers: 3
Last contact: Fri Apr 24 04:28:36 PDT 2015
{code} The machine has only 8.40 GB of DFS storage remaining on that node, so it is futile to attempt to move any blocks to that particular machine. This is a similar concern when a machine loses disks, since the utilization comparisons always compare percentages per-node. Even that scenario needs to cap data movement to that node by the DFS Remaining % variable. Trying to move any more data than that to a given node will always fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
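The condition the report is asking for can be sketched as a simple pre-check (the helper name is hypothetical, not the actual Balancer API): before scheduling a move, compare the target's free space against the block size, which is exactly what the DiskOutOfSpaceException above complains about.

```java
public class TargetSpaceCheck {
    // Hypothetical helper: a node is a valid move target only if its free
    // space can absorb the block being moved.
    static boolean canReceive(long remainingBytes, long blockSizeBytes) {
        return remainingBytes >= blockSizeBytes;
    }

    public static void main(String[] args) {
        long blockSize = 268435456L;  // 256 MB, as in the error message
        long volumeFree = 225042432L; // from the DiskOutOfSpaceException above
        // false: the move is doomed and should be skipped up front
        System.out.println(canReceive(volumeFree, blockSize));
    }
}
```

Using DFS Remaining (absolute bytes) rather than DFS Used % for this check avoids scheduling the 47.68 GB of moves that can never succeed.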
[jira] [Commented] (HDFS-8287) DFSStripedOutputStream.writeChunk should not wait for writing parity
[ https://issues.apache.org/jira/browse/HDFS-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643743#comment-14643743 ] Kai Sasaki commented on HDFS-8287: -- [~szetszwo] Thank you so much for the detailed instructions! DFSStripedOutputStream.writeChunk should not wait for writing parity - Key: HDFS-8287 URL: https://issues.apache.org/jira/browse/HDFS-8287 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Kai Sasaki When a striping cell is full, writeChunk computes and generates parity packets. It sequentially calls waitAndQueuePacket, so the user client cannot continue to write data until it finishes. We should instead allow the user client to continue writing rather than blocking it while parity is written. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
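The proposed change can be sketched as handing parity packets to a separate writer so the data path keeps flowing. The names below are illustrative, not the actual DFSStripedOutputStream API.

```java
import java.util.concurrent.*;

// Sketch: parity packets go to their own executor instead of being queued
// inline, so writeChunk's caller is not blocked on parity.
public class AsyncParitySketch {
    private final ExecutorService parityWriter = Executors.newSingleThreadExecutor();

    Future<?> writeParityAsync(Runnable parityPacket) {
        return parityWriter.submit(parityPacket); // caller keeps writing data
    }

    // Block only when the cell must be completed; returns true on success.
    static boolean waitFor(Future<?> f) {
        try {
            f.get();
            return f.isDone();
        } catch (Exception e) {
            return false;
        }
    }

    void shutdown() {
        parityWriter.shutdown();
    }

    public static void main(String[] args) {
        AsyncParitySketch s = new AsyncParitySketch();
        Future<?> f = s.writeParityAsync(() -> { /* encode + enqueue parity */ });
        // ... user data writes proceed here without waiting on parity ...
        System.out.println(waitFor(f)); // true once the parity write completes
        s.shutdown();
    }
}
```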
[jira] [Commented] (HDFS-5958) One very large node in a cluster prevents balancer from balancing data
[ https://issues.apache.org/jira/browse/HDFS-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643755#comment-14643755 ] Tsz Wo Nicholas Sze commented on HDFS-5958: --- BTW, your case may be similar to HDFS-8826. One very large node in a cluster prevents balancer from balancing data -- Key: HDFS-5958 URL: https://issues.apache.org/jira/browse/HDFS-5958 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Affects Versions: 2.2.0 Environment: Hadoop cluster with 4 nodes: 3 with 500Gb drives and one with 4Tb drive. Reporter: Alexey Kovyrin In a cluster with a set of small nodes and one much larger node, the balancer always selects the large node as the target even though it already has a copy of each block in the cluster. This causes the balancer to enter an infinite loop and stop balancing other nodes, because each balancing iteration selects the same target and then cannot find a single block to move. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8822) Add SSD storagepolicy tests in TestBlockStoragePolicy#testDefaultPolicies
[ https://issues.apache.org/jira/browse/HDFS-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14642558#comment-14642558 ] Hadoop QA commented on HDFS-8822: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 8m 28s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 5s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 20s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 29s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 19s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 31s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 33s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 1m 8s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 161m 2s | Tests failed in hadoop-hdfs. 
| | | | 184m 58s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestDistributedFileSystem | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12747296/HDFS-8822-01.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / 1df7868 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11845/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11845/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11845/console | This message was automatically generated. Add SSD storagepolicy tests in TestBlockStoragePolicy#testDefaultPolicies - Key: HDFS-8822 URL: https://issues.apache.org/jira/browse/HDFS-8822 Project: Hadoop HDFS Issue Type: Improvement Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-8822-01.patch Add tests for storage policies ALLSSD and ONESSD in {{TestBlockStoragePolicy#testDefaultPolicies(..)}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8769) Erasure Coding: unit test for SequentialBlockGroupIdGenerator
[ https://issues.apache.org/jira/browse/HDFS-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14642563#comment-14642563 ] Rakesh R commented on HDFS-8769: Thank you [~walter.k.su] for reviewing and committing the changes! Erasure Coding: unit test for SequentialBlockGroupIdGenerator - Key: HDFS-8769 URL: https://issues.apache.org/jira/browse/HDFS-8769 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Walter Su Assignee: Rakesh R Fix For: HDFS-7285 Attachments: HDFS-8769-HDFS-7285-00.patch, HDFS-8769-HDFS-7285-01.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8499) Refactor BlockInfo class hierarchy with static helper class
[ https://issues.apache.org/jira/browse/HDFS-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643866#comment-14643866 ] Zhe Zhang commented on HDFS-8499: - To be able to proceed with EC branch merging, I'm OK with reverting this patch as an intermediate solution. Reopening this JIRA to investigate the most suitable solution to the {{BlockInfo}} multi-inheritance problem. Refactor BlockInfo class hierarchy with static helper class --- Key: HDFS-8499 URL: https://issues.apache.org/jira/browse/HDFS-8499 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.7.0 Reporter: Zhe Zhang Assignee: Zhe Zhang Fix For: 2.8.0 Attachments: HDFS-8499.00.patch, HDFS-8499.01.patch, HDFS-8499.02.patch, HDFS-8499.03.patch, HDFS-8499.04.patch, HDFS-8499.05.patch, HDFS-8499.06.patch, HDFS-8499.07.patch, HDFS-8499.UCFeature.patch, HDFS-bistriped.patch In the HDFS-7285 branch, the {{BlockInfoUnderConstruction}} interface provides a common abstraction for striped and contiguous UC blocks. This JIRA aims to merge it to trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HDFS-8499) Refactor BlockInfo class hierarchy with static helper class
[ https://issues.apache.org/jira/browse/HDFS-8499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang reopened HDFS-8499: - Refactor BlockInfo class hierarchy with static helper class --- Key: HDFS-8499 URL: https://issues.apache.org/jira/browse/HDFS-8499 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.7.0 Reporter: Zhe Zhang Assignee: Zhe Zhang Fix For: 2.8.0 Attachments: HDFS-8499.00.patch, HDFS-8499.01.patch, HDFS-8499.02.patch, HDFS-8499.03.patch, HDFS-8499.04.patch, HDFS-8499.05.patch, HDFS-8499.06.patch, HDFS-8499.07.patch, HDFS-8499.UCFeature.patch, HDFS-bistriped.patch In HDFS-7285 branch, the {{BlockInfoUnderConstruction}} interface provides a common abstraction for striped and contiguous UC blocks. This JIRA aims to merge it to trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7858) Improve HA Namenode Failover detection on the client
[ https://issues.apache.org/jira/browse/HDFS-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643869#comment-14643869 ] Hadoop QA commented on HDFS-7858: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 22m 33s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:red}-1{color} | javac | 7m 38s | The applied patch generated 2 additional warning messages. | | {color:green}+1{color} | javadoc | 9m 51s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 2m 57s | Site still builds. | | {color:green}+1{color} | checkstyle | 2m 2s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 27s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 24s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | common tests | 22m 20s | Tests passed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 158m 53s | Tests failed in hadoop-hdfs. 
| | | | 233m 4s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.ha.TestRequestHedgingProxyProvider | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12747474/HDFS-7858.13.patch | | Optional Tests | javadoc javac unit findbugs checkstyle site | | git revision | trunk / 3572ebd | | javac | https://builds.apache.org/job/PreCommit-HDFS-Build/11852/artifact/patchprocess/diffJavacWarnings.txt | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11852/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11852/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11852/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11852/console | This message was automatically generated. Improve HA Namenode Failover detection on the client Key: HDFS-7858 URL: https://issues.apache.org/jira/browse/HDFS-7858 Project: Hadoop HDFS Issue Type: Improvement Reporter: Arun Suresh Assignee: Arun Suresh Labels: BB2015-05-TBR Attachments: HDFS-7858.1.patch, HDFS-7858.10.patch, HDFS-7858.10.patch, HDFS-7858.11.patch, HDFS-7858.12.patch, HDFS-7858.13.patch, HDFS-7858.2.patch, HDFS-7858.2.patch, HDFS-7858.3.patch, HDFS-7858.4.patch, HDFS-7858.5.patch, HDFS-7858.6.patch, HDFS-7858.7.patch, HDFS-7858.8.patch, HDFS-7858.9.patch In an HA deployment, clients are configured with the hostnames of both the Active and Standby Namenodes. Clients will first try one of the NNs (non-deterministically), and if it is a standby NN, it will respond to the client to retry the request on the other Namenode.
If the client happens to talk to the Standby first, and the standby is undergoing GC or is busy, then those clients might not get a response soon enough to try the other NN. Proposed approach to solve this: 1) Since Zookeeper is already used as the failover controller, the clients could talk to ZK and find out which is the active namenode before contacting it. 2) Long-lived DFSClients would have a ZK watch configured which fires when there is a failover, so they do not have to query ZK every time to find out the active NN. 3) Clients can also cache the last active NN in the user's home directory (~/.lastNN) so that short-lived clients can try that Namenode first before querying ZK. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
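Setting the ZK-based lookup aside, the general "ask both and take the first answer" idea (the hedging approach exercised by TestRequestHedgingProxyProvider in the QA run above) can be sketched as follows; the method names are assumptions, not the actual proxy-provider API.

```java
import java.util.concurrent.*;

// Sketch of request hedging: fire the same request at both namenodes and
// take whichever answers first, so a slow or GC-ing standby cannot stall
// the client. In the real provider a StandbyException would trigger a
// retry against the other proxy; that refinement is omitted here.
public class HedgedFailoverSketch {
    static String hedge(Callable<String> nn1, Callable<String> nn2) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CompletionService<String> cs = new ExecutorCompletionService<>(pool);
            cs.submit(nn1);
            cs.submit(nn2);
            return cs.take().get();   // first completed response wins
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow();       // cancel the slower request
        }
    }

    public static void main(String[] args) {
        String winner = hedge(
            () -> { Thread.sleep(200); return "standby"; }, // slow / GC-ing NN
            () -> "active");                                // healthy active NN
        System.out.println(winner); // active
    }
}
```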
[jira] [Created] (HDFS-8827) Erasure Coding: When namenode processes over replicated striped block, NPE will occur in ReplicationMonitor
Takuya Fukudome created HDFS-8827: - Summary: Erasure Coding: When namenode processes over replicated striped block, NPE will occur in ReplicationMonitor Key: HDFS-8827 URL: https://issues.apache.org/jira/browse/HDFS-8827 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Takuya Fukudome Assignee: Takuya Fukudome In our test cluster, when the namenode processed over replicated striped blocks, a null pointer exception (NPE) occurred. This happened in the following situation: 1) some datanodes shut down. 2) namenode recovers the block group which lost internal blocks. 3) restart the stopped datanodes. 4) namenode processes over replicated striped blocks. 5) NPE occurs. I think BlockPlacementPolicyDefault#chooseReplicaToDelete will return null in this situation, which causes this NPE problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8827) Erasure Coding: When namenode processes over replicated striped block, NPE will occur in ReplicationMonitor
[ https://issues.apache.org/jira/browse/HDFS-8827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Fukudome updated HDFS-8827: -- Attachment: processing-over-replica-npe.log Attached the log which includes the null pointer exception message. Erasure Coding: When namenode processes over replicated striped block, NPE will occur in ReplicationMonitor -- Key: HDFS-8827 URL: https://issues.apache.org/jira/browse/HDFS-8827 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Takuya Fukudome Assignee: Takuya Fukudome Attachments: processing-over-replica-npe.log In our test cluster, when the namenode processed over replicated striped blocks, a null pointer exception (NPE) occurred. This happened in the following situation: 1) some datanodes shut down. 2) namenode recovers the block group which lost internal blocks. 3) restart the stopped datanodes. 4) namenode processes over replicated striped blocks. 5) NPE occurs. I think BlockPlacementPolicyDefault#chooseReplicaToDelete will return null in this situation, which causes this NPE problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
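If the suspicion about chooseReplicaToDelete returning null is right, the defensive fix could look like the following sketch (the surrounding logic and names are hypothetical, not the actual ReplicationMonitor code): check the chosen replica for null and skip the block instead of dereferencing it.

```java
import java.util.function.Supplier;

// Hypothetical guard for the suspected bug: when no deletable replica is
// found for an over-replicated striped group, skip the block rather than
// dereferencing the null result.
public class ReplicaDeleteGuardSketch {
    static String process(Supplier<String> chooseReplicaToDelete) {
        String chosen = chooseReplicaToDelete.get();
        if (chosen == null) {
            return "skip";              // avoids the NPE seen in ReplicationMonitor
        }
        return "delete " + chosen;
    }

    public static void main(String[] args) {
        System.out.println(process(() -> null));  // skip
        System.out.println(process(() -> "dn1")); // delete dn1
    }
}
```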
[jira] [Commented] (HDFS-3570) Balancer shouldn't rely on DFS Space Used % as that ignores non-DFS used space
[ https://issues.apache.org/jira/browse/HDFS-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643825#comment-14643825 ] Akira AJISAKA commented on HDFS-3570: - bq. Have you set dfs.datanode.du.reserved for the non-dfs used space? I have not set the parameter. Setting the parameter for non-dfs used space is an ideal way to avoid the problem; however, I'd like to handle the situation where someone unintentionally puts big files on a DataNode and then someone else runs the balancer. Balancer shouldn't rely on DFS Space Used % as that ignores non-DFS used space Key: HDFS-3570 URL: https://issues.apache.org/jira/browse/HDFS-3570 Project: Hadoop HDFS Issue Type: Bug Components: balancer mover Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Akira AJISAKA Priority: Minor Attachments: HDFS-3570.003.patch, HDFS-3570.2.patch, HDFS-3570.aash.1.patch Report from a user here: https://groups.google.com/a/cloudera.org/d/msg/cdh-user/pIhNyDVxdVY/b7ENZmEvBjIJ, post archived at http://pastebin.com/eVFkk0A0 This user had a specific DN that had a large non-DFS usage among dfs.data.dirs, and very little DFS usage (which is computed against total possible capacity). Balancer apparently only looks at the usage, and fails to consider that non-DFS usage may also be high on a DN/cluster. Hence, it thinks that if a DFS Usage report from a DN is only 8%, it has a lot of free space to write more blocks, when that isn't true as shown by the case of this user. It went on scheduling writes to the DN to balance it out, but the DN simply can't accept any more blocks as a result of its disks' state. I think it would be better if we _computed_ the actual utilization based on {{(capacity - (actual remaining space))/(capacity)}}, as opposed to the current {{(dfs used)/(capacity)}}. Thoughts? This isn't very critical, however, because it is very rare to see DN space being used for non-DN data, but it does expose a valid bug.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
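The two metrics under discussion diverge exactly when non-DFS usage is large; a small sketch using the formulas from the description above:

```java
public class UtilizationSketch {
    // Metric the balancer currently uses:
    static double dfsUsedPercent(long dfsUsed, long capacity) {
        return 100.0 * dfsUsed / capacity;
    }

    // Metric proposed in the report: derive utilization from the actual
    // remaining space, so non-DFS usage is counted too.
    static double effectiveUsedPercent(long remaining, long capacity) {
        return 100.0 * (capacity - remaining) / capacity;
    }

    public static void main(String[] args) {
        long capacity = 1000L, dfsUsed = 80L, nonDfsUsed = 900L;
        long remaining = capacity - dfsUsed - nonDfsUsed; // only 20 units left
        System.out.println(dfsUsedPercent(dfsUsed, capacity));         // 8.0  -> looks nearly empty
        System.out.println(effectiveUsedPercent(remaining, capacity)); // 98.0 -> actually nearly full
    }
}
```

This reproduces the reported case: an 8% DFS Used figure hides a node that is, in fact, 98% full.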
[jira] [Commented] (HDFS-8824) Do not use small blocks for balancing the cluster
[ https://issues.apache.org/jira/browse/HDFS-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643834#comment-14643834 ] Hadoop QA commented on HDFS-8824: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 17s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 40s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 39s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 26s | The applied patch generated 12 new checkstyle issues (total was 786, now 792). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 22s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 34s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 3s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 221m 38s | Tests failed in hadoop-hdfs. 
| | | | 265m 39s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestFileStatus | | | hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes | | | hadoop.hdfs.TestReadWhileWriting | | | hadoop.hdfs.TestFSOutputSummer | | | hadoop.hdfs.TestParallelShortCircuitLegacyRead | | | hadoop.hdfs.TestAppendSnapshotTruncate | | | hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer | | | hadoop.hdfs.TestDFSClientRetries | | | hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup | | Timed out tests | org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS | | | org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes | | | org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer | | | org.apache.hadoop.hdfs.server.balancer.TestBalancer | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12747457/h8824_20150727b.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 3e6fce9 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11850/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11850/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11850/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11850/console | This message was automatically generated. 
Do not use small blocks for balancing the cluster - Key: HDFS-8824 URL: https://issues.apache.org/jira/browse/HDFS-8824 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h8824_20150727b.patch Balancer gets datanode block lists from the NN and then moves the blocks in order to balance the cluster. It should not use small blocks, since moving them generates a lot of overhead while contributing little to balancing the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
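A minimal sketch of the proposed filtering (the 32 MB cutoff is an assumed illustration, not a value from the patch): drop candidate blocks below a minimum size before dispatching moves.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical filter: keep only blocks at least minBytes large when
// picking candidates to move; per-move RPC and copy overhead dominates
// for tiny blocks while they barely shift the utilization numbers.
public class SmallBlockFilterSketch {
    static List<Long> movable(List<Long> blockSizes, long minBytes) {
        return blockSizes.stream()
                .filter(s -> s >= minBytes)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Long> blocks = Arrays.asList(1024L, 52428800L, 134217728L);
        // Skip anything under 32 MB:
        System.out.println(movable(blocks, 32L * 1024 * 1024)); // [52428800, 134217728]
    }
}
```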
[jira] [Commented] (HDFS-8818) Allow Balancer to run faster
[ https://issues.apache.org/jira/browse/HDFS-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14643788#comment-14643788 ] Hadoop QA commented on HDFS-8818: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 0s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 36s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 34s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 20s | The applied patch generated 7 new checkstyle issues (total was 525, now 531). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 20s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 32s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | native | 3m 8s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 159m 31s | Tests failed in hadoop-hdfs. 
| | | | 203m 2s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestRollingUpgrade | | | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12747448/h8818_20150727.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 3e6fce9 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11849/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11849/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11849/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11849/console | This message was automatically generated. Allow Balancer to run faster Key: HDFS-8818 URL: https://issues.apache.org/jira/browse/HDFS-8818 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer mover Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Attachments: h8818_20150723.patch, h8818_20150727.patch The original design of the Balancer intentionally makes it run slowly, so that the balancing activities won't affect the normal cluster activities and the running jobs. There are new use cases where a cluster admin may choose to balance the cluster when the cluster load is low, or in a maintenance window, so we should have an option to allow the Balancer to run faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8823) Move replication factor into individual blocks
[ https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8823: - Attachment: (was: HDFS-8823.WIP.patch) Move replication factor into individual blocks -- Key: HDFS-8823 URL: https://issues.apache.org/jira/browse/HDFS-8823 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8823.000.patch This jira proposes to record the replication factor in the {{BlockInfo}} class. The changes have two advantages: * Decoupling the namespace and the block management layer. It is a prerequisite step to move block management off the heap or to a separate process. * Increased flexibility on replicating blocks. Currently the replication factors of all blocks have to be the same. The replication factors of these blocks are equal to the highest replication factor across all snapshots. The changes will allow blocks in a file to have different replication factors, potentially saving some space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8823) Move replication factor into individual blocks
[ https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8823: - Status: Patch Available (was: Open) Move replication factor into individual blocks -- Key: HDFS-8823 URL: https://issues.apache.org/jira/browse/HDFS-8823 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8823.000.patch This jira proposes to record the replication factor in the {{BlockInfo}} class. The changes have two advantages: * Decoupling the namespace and the block management layer. It is a prerequisite step to move block management off the heap or to a separate process. * Increased flexibility on replicating blocks. Currently the replication factors of all blocks have to be the same. The replication factors of these blocks are equal to the highest replication factor across all snapshots. The changes will allow blocks in a file to have different replication factors, potentially saving some space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8823) Move replication factor into individual blocks
[ https://issues.apache.org/jira/browse/HDFS-8823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8823: - Attachment: HDFS-8823.000.patch Move replication factor into individual blocks -- Key: HDFS-8823 URL: https://issues.apache.org/jira/browse/HDFS-8823 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8823.000.patch This jira proposes to record the replication factor in the {{BlockInfo}} class. The changes have two advantages: * Decoupling the namespace and the block management layer. It is a prerequisite step to move block management off the heap or to a separate process. * Increased flexibility on replicating blocks. Currently the replication factors of all blocks have to be the same. The replication factors of these blocks are equal to the highest replication factor across all snapshots. The changes will allow blocks in a file to have different replication factors, potentially saving some space. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
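The core data-structure change in the proposal can be sketched as follows; this is an illustrative shape, not the actual HDFS-8823 patch: the replication factor lives on each block rather than on the file, so two blocks of one file can carry different factors.

```java
// Illustrative sketch of per-block replication (not the real BlockInfo).
public class BlockInfoSketch {
    static final class BlockInfo {
        final long blockId;
        final short replication; // previously derived from the owning INodeFile

        BlockInfo(long blockId, short replication) {
            this.blockId = blockId;
            this.replication = replication;
        }
    }

    public static void main(String[] args) {
        BlockInfo b1 = new BlockInfo(1L, (short) 3); // live block: factor 3
        BlockInfo b2 = new BlockInfo(2L, (short) 2); // snapshot-only block: factor 2
        // The same file can now hold blocks with different factors,
        // instead of pinning all of them to the highest factor across snapshots.
        System.out.println(b1.replication + " " + b2.replication); // 3 2
    }
}
```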
[jira] [Commented] (HDFS-8388) Time and Date format need to be in sync in Namenode UI page
[ https://issues.apache.org/jira/browse/HDFS-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14642325#comment-14642325 ] Akira AJISAKA commented on HDFS-8388: - Thanks [~surendrasingh] for updating the patch. Two comments from me. 1. Started time looks invalid. Would you fix it? 2. Would you use the ddd MMM DD HH:mm:ss ZZ format instead of ddd MMM DD HH:mm:ss, to be consistent with the YARN Web UI? Time and Date format need to be in sync in Namenode UI page --- Key: HDFS-8388 URL: https://issues.apache.org/jira/browse/HDFS-8388 Project: Hadoop HDFS Issue Type: Bug Reporter: Archana T Assignee: Surendra Singh Lilhore Priority: Minor Attachments: HDFS-8388-002.patch, HDFS-8388.patch, HDFS-8388_1.patch In the NameNode UI page, the date and time formats displayed are currently not in sync. Started:Wed May 13 12:28:02 IST 2015 Compiled:23 Apr 2015 12:22:59 Block Deletion Start Time 13 May 2015 12:28:02 We can keep a common format in all the above places. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8388) Time and Date format need to be in sync in Namenode UI page
[ https://issues.apache.org/jira/browse/HDFS-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated HDFS-8388: Attachment: ScreenShot-InvalidDate.png Attaching a screenshot. !ScreenShot-InvalidDate.png! Time and Date format need to be in sync in Namenode UI page --- Key: HDFS-8388 URL: https://issues.apache.org/jira/browse/HDFS-8388 Project: Hadoop HDFS Issue Type: Bug Reporter: Archana T Assignee: Surendra Singh Lilhore Priority: Minor Attachments: HDFS-8388-002.patch, HDFS-8388.patch, HDFS-8388_1.patch, ScreenShot-InvalidDate.png In NameNode UI Page, Date and Time FORMAT displayed on the page are not in sync currently. Started:Wed May 13 12:28:02 IST 2015 Compiled:23 Apr 2015 12:22:59 Block Deletion Start Time 13 May 2015 12:28:02 We can keep a common format in all the above places.
[jira] [Commented] (HDFS-8388) Time and Date format need to be in sync in Namenode UI page
[ https://issues.apache.org/jira/browse/HDFS-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14642351#comment-14642351 ] Akira AJISAKA commented on HDFS-8388: - bq. 1. Started time looks invalid. Would you fix it? Started time is already rendered by Date.toString(), so I'm thinking we don't need to re-format it with moment.js. I have an additional comment: Would you use moment.min.js instead of moment.js to minimize the source code, as in HDFS-8816? Time and Date format need to be in sync in Namenode UI page --- Key: HDFS-8388 URL: https://issues.apache.org/jira/browse/HDFS-8388 Project: Hadoop HDFS Issue Type: Bug Reporter: Archana T Assignee: Surendra Singh Lilhore Priority: Minor Attachments: HDFS-8388-002.patch, HDFS-8388.patch, HDFS-8388_1.patch, ScreenShot-InvalidDate.png In NameNode UI Page, Date and Time FORMAT displayed on the page are not in sync currently. Started:Wed May 13 12:28:02 IST 2015 Compiled:23 Apr 2015 12:22:59 Block Deletion Start Time 13 May 2015 12:28:02 We can keep a common format in all the above places.
[jira] [Created] (HDFS-8822) Add SSD storagepolicy tests in TestBlockStoragePolicy#testDefaultPolicies
Vinayakumar B created HDFS-8822: --- Summary: Add SSD storagepolicy tests in TestBlockStoragePolicy#testDefaultPolicies Key: HDFS-8822 URL: https://issues.apache.org/jira/browse/HDFS-8822 Project: Hadoop HDFS Issue Type: Improvement Reporter: Vinayakumar B Assignee: Vinayakumar B Add tests for storage policies ALLSSD and ONESSD in {{TestBlockStoragePolicy#testDefaultPolicies(..)}}
[jira] [Updated] (HDFS-8822) Add SSD storagepolicy tests in TestBlockStoragePolicy#testDefaultPolicies
[ https://issues.apache.org/jira/browse/HDFS-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-8822: Status: Patch Available (was: Open) Add SSD storagepolicy tests in TestBlockStoragePolicy#testDefaultPolicies - Key: HDFS-8822 URL: https://issues.apache.org/jira/browse/HDFS-8822 Project: Hadoop HDFS Issue Type: Improvement Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-8822-01.patch Add tests for storage policies ALLSSD and ONESSD in {{TestBlockStoragePolicy#testDefaultPolicies(..)}}
[jira] [Updated] (HDFS-8822) Add SSD storagepolicy tests in TestBlockStoragePolicy#testDefaultPolicies
[ https://issues.apache.org/jira/browse/HDFS-8822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinayakumar B updated HDFS-8822: Attachment: HDFS-8822-01.patch Attaching a patch for the same. Please review. Add SSD storagepolicy tests in TestBlockStoragePolicy#testDefaultPolicies - Key: HDFS-8822 URL: https://issues.apache.org/jira/browse/HDFS-8822 Project: Hadoop HDFS Issue Type: Improvement Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-8822-01.patch Add tests for storage policies ALLSSD and ONESSD in {{TestBlockStoragePolicy#testDefaultPolicies(..)}}
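The two policies under test differ per replica: ALL_SSD places every replica on SSD, while ONE_SSD places only the first replica on SSD and the rest on DISK. A self-contained model of the behavior the added assertions would check (this is an illustrative sketch, not Hadoop's actual {{BlockStoragePolicy}} or the patch's test code):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative model of the ALL_SSD and ONE_SSD placement rules; class and
// method names are hypothetical, not Hadoop's real API.
public class StoragePolicyModel {
    enum StorageType { SSD, DISK }

    // Returns the storage type chosen for each of `replication` replicas.
    static List<StorageType> chooseStorageTypes(String policy, int replication) {
        switch (policy) {
            case "ALL_SSD":
                // Every replica goes to SSD.
                return Collections.nCopies(replication, StorageType.SSD);
            case "ONE_SSD": {
                // First replica on SSD, remaining replicas on DISK.
                StorageType[] types = new StorageType[replication];
                Arrays.fill(types, StorageType.DISK);
                types[0] = StorageType.SSD;
                return Arrays.asList(types);
            }
            default:
                throw new IllegalArgumentException("unknown policy: " + policy);
        }
    }
}
```

A test along the lines of {{testDefaultPolicies}} would then assert, for each policy and replication factor, that the chosen storage-type list matches the expected placement.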
[jira] [Commented] (HDFS-8811) Move BlockStoragePolicy name's constants from HdfsServerConstants.java to HdfsConstants.java
[ https://issues.apache.org/jira/browse/HDFS-8811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14642517#comment-14642517 ] Vinayakumar B commented on HDFS-8811: - Checkstyle and test failures are unrelated. Move BlockStoragePolicy name's constants from HdfsServerConstants.java to HdfsConstants.java Key: HDFS-8811 URL: https://issues.apache.org/jira/browse/HDFS-8811 Project: Hadoop HDFS Issue Type: Improvement Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-8811-01.patch Currently {{HdfsServerConstants.java}} has the following constants, {code} String HOT_STORAGE_POLICY_NAME = "HOT"; String WARM_STORAGE_POLICY_NAME = "WARM"; String COLD_STORAGE_POLICY_NAME = "COLD";{code} and {{HdfsConstants.java}} has the following {code} public static final String MEMORY_STORAGE_POLICY_NAME = "LAZY_PERSIST"; public static final String ALLSSD_STORAGE_POLICY_NAME = "ALL_SSD"; public static final String ONESSD_STORAGE_POLICY_NAME = "ONE_SSD";{code} It would be better to move all of these to one place, {{HdfsConstants.java}}, which client APIs could also access since it is present in the hdfs-client module.
[jira] [Commented] (HDFS-8811) Move BlockStoragePolicy name's constants from HdfsServerConstants.java to HdfsConstants.java
[ https://issues.apache.org/jira/browse/HDFS-8811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14642478#comment-14642478 ] Hadoop QA commented on HDFS-8811: -
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 15m 37s | Findbugs (version ) appears to be broken on trunk. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. |
| {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 38s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 36s | The applied patch generated 6 new checkstyle issues (total was 0, now 6). |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 4m 20s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native | 3m 3s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 159m 31s | Tests failed in hadoop-hdfs. |
| {color:green}+1{color} | hdfs tests | 0m 27s | Tests passed in hadoop-hdfs-client. |
| | | | 204m 22s | |
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestAppendSnapshotTruncate |
| | hadoop.hdfs.TestLeaseRecovery2 |
| | hadoop.hdfs.TestDistributedFileSystem |
| | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot |
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12747284/HDFS-8811-01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 1df7868 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11843/artifact/patchprocess/diffcheckstylehadoop-hdfs-client.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11843/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| hadoop-hdfs-client test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11843/artifact/patchprocess/testrun_hadoop-hdfs-client.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11843/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11843/console |
This message was automatically generated.
Move BlockStoragePolicy name's constants from HdfsServerConstants.java to HdfsConstants.java Key: HDFS-8811 URL: https://issues.apache.org/jira/browse/HDFS-8811 Project: Hadoop HDFS Issue Type: Improvement Reporter: Vinayakumar B Assignee: Vinayakumar B Attachments: HDFS-8811-01.patch Currently {{HdfsServerConstants.java}} has the following constants, {code} String HOT_STORAGE_POLICY_NAME = "HOT"; String WARM_STORAGE_POLICY_NAME = "WARM"; String COLD_STORAGE_POLICY_NAME = "COLD";{code} and {{HdfsConstants.java}} has the following {code} public static final String MEMORY_STORAGE_POLICY_NAME = "LAZY_PERSIST"; public static final String ALLSSD_STORAGE_POLICY_NAME = "ALL_SSD"; public static final String ONESSD_STORAGE_POLICY_NAME = "ONE_SSD";{code} It would be better to move all of these to one place, {{HdfsConstants.java}}, which client APIs could also access since it is present in the hdfs-client module.
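A sketch of the end state this JIRA proposes: all six policy-name constants collected into one client-visible constants holder. The constant names and values come from the JIRA text; the surrounding class is a simplified stand-in, not the actual {{HdfsConstants.java}}.

```java
// Illustrative constants holder combining the policy names that were split
// across HdfsServerConstants.java and HdfsConstants.java. The class name is
// hypothetical; only the constant names/values are taken from the JIRA text.
public final class HdfsConstantsSketch {
    public static final String HOT_STORAGE_POLICY_NAME = "HOT";
    public static final String WARM_STORAGE_POLICY_NAME = "WARM";
    public static final String COLD_STORAGE_POLICY_NAME = "COLD";
    public static final String MEMORY_STORAGE_POLICY_NAME = "LAZY_PERSIST";
    public static final String ALLSSD_STORAGE_POLICY_NAME = "ALL_SSD";
    public static final String ONESSD_STORAGE_POLICY_NAME = "ONE_SSD";

    private HdfsConstantsSketch() {} // constants holder, not instantiable
}
```

Keeping all the names in one class in the hdfs-client module lets client-side code reference them directly instead of duplicating string literals or depending on server-only classes.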