[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS - DN as a collection of storages
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911895#comment-13911895 ] Andrew Wang commented on HDFS-2832: --- Jiahongchao, we don't expose any user API yet for HSM. This is going to be added in HDFS-5682. It might have been a mistake to advertise HSM in the release note at this stage of development, since I imagine we'll be getting a lot of questions like this. Enable support for heterogeneous storages in HDFS - DN as a collection of storages -- Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 3.0.0, 2.3.0 Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch, h2832_20131210.patch, h2832_20131211.patch, h2832_20131211b.patch, h2832_branch-2_20131226.patch, h2832_branch-2_20140103.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS - DN as a collection of storages
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13911354#comment-13911354 ] Jiahongchao commented on HDFS-2832: --- hey guys, it seems that this feature has been included in hadoop 2.3. But how can I use it in my cluster? Where is the user document? Enable support for heterogeneous storages in HDFS - DN as a collection of storages -- Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 3.0.0, 2.3.0 Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch, h2832_20131210.patch, h2832_20131211.patch, h2832_20131211b.patch, h2832_branch-2_20131226.patch, h2832_branch-2_20140103.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864102#comment-13864102 ] Hudson commented on HDFS-2832: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #445 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/445/]) HDFS-2832. Update CHANGES.txt to reflect merge to branch-2 (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1556088) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 3.0.0, 2.4.0 Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch, h2832_20131210.patch, h2832_20131211.patch, h2832_20131211b.patch, h2832_branch-2_20131226.patch, h2832_branch-2_20140103.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864219#comment-13864219 ] Hudson commented on HDFS-2832: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1637 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1637/]) HDFS-2832. Update CHANGES.txt to reflect merge to branch-2 (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1556088) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 3.0.0, 2.4.0 Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch, h2832_20131210.patch, h2832_20131211.patch, h2832_20131211b.patch, h2832_branch-2_20131226.patch, h2832_branch-2_20140103.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864283#comment-13864283 ] Hudson commented on HDFS-2832: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1662 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1662/]) HDFS-2832. Update CHANGES.txt to reflect merge to branch-2 (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1556088) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 3.0.0, 2.4.0 Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch, h2832_20131210.patch, h2832_20131211.patch, h2832_20131211b.patch, h2832_branch-2_20131226.patch, h2832_branch-2_20140103.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863742#comment-13863742 ] Hudson commented on HDFS-2832: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4965 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4965/]) HDFS-2832. Update CHANGES.txt to reflect merge to branch-2 (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1556088) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 3.0.0, 2.4.0 Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch, h2832_20131210.patch, h2832_20131211.patch, h2832_20131211b.patch, h2832_branch-2_20131226.patch, h2832_branch-2_20140103.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13857272#comment-13857272 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12620579/editsStored against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5795//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 3.0.0 Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch, h2832_20131210.patch, h2832_20131211.patch, h2832_20131211b.patch, h2832_branch-2_20131226.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13852126#comment-13852126 ] Arpit Agarwal commented on HDFS-2832: - Phase 2 tasks will be tracked under HDFS-5682. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch, h2832_20131210.patch, h2832_20131211.patch, h2832_20131211b.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847392#comment-13847392 ] Hudson commented on HDFS-2832: -- FAILURE: Integrated in Hadoop-Yarn-trunk #420 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/420/]) HDFS-2832: Update binary file editsStored for TestOfflineEditsViewer (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1550364) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored svn merge --reintegrate https://svn.apache.org/repos/asf/hadoop/common/branches/HDFS-2832 for merging Heterogeneous Storage feature branch (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1550363) * /hadoop/common/trunk * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/docs * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/core * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/StorageType.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockListAsLongs.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeID.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LayoutVersion.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LocatedBlock.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/UnregisteredNodeException.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockCollection.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java *
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13847538#comment-13847538 ] Hudson commented on HDFS-2832: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1637 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1637/]) HDFS-2832: Update binary file editsStored for TestOfflineEditsViewer (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1550364) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored svn merge --reintegrate https://svn.apache.org/repos/asf/hadoop/common/branches/HDFS-2832 for merging Heterogeneous Storage feature branch (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1550363) * /hadoop/common/trunk * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/docs * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/core * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/StorageType.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockListAsLongs.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeID.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LayoutVersion.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LocatedBlock.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/UnregisteredNodeException.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockCollection.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java *
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13846134#comment-13846134 ] Arpit Agarwal commented on HDFS-2832: - Merged the HDFS-2832 feature branch to trunk. Thanks Nicholas, Suresh, Junping and Eric! Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch, h2832_20131210.patch, h2832_20131211.patch, h2832_20131211b.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13846143#comment-13846143 ] Hudson commented on HDFS-2832: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4872 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4872/]) HDFS-2832: Update binary file editsStored for TestOfflineEditsViewer (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1550364) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch, h2832_20131210.patch, h2832_20131211.patch, h2832_20131211b.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13846370#comment-13846370 ] Hudson commented on HDFS-2832: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1610 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1610/]) HDFS-2832: Update binary file editsStored for TestOfflineEditsViewer (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1550364) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/resources/editsStored svn merge --reintegrate https://svn.apache.org/repos/asf/hadoop/common/branches/HDFS-2832 for merging Heterogeneous Storage feature branch (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1550363) * /hadoop/common/trunk * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/docs * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/core * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/StorageType.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockListAsLongs.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeID.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LayoutVersion.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LocatedBlock.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/UnregisteredNodeException.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockCollection.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java *
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13846681#comment-13846681 ] Suresh Srinivas commented on HDFS-2832: --- Thanks for this significant contribution [~arpitagarwal] and [~szetszwo]! Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch, h2832_20131210.patch, h2832_20131211.patch, h2832_20131211b.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13845992#comment-13845992 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12618314/h2832_20131211.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 48 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated -12 warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup org.apache.hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5699//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5699//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch, h2832_20131210.patch, h2832_20131211.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13846091#comment-13846091 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12618347/h2832_20131211b.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 48 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5701//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5701//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch, h2832_20131210.patch, h2832_20131211.patch, h2832_20131211b.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13846131#comment-13846131 ] Hudson commented on HDFS-2832: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4871 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4871/]) svn merge --reintegrate https://svn.apache.org/repos/asf/hadoop/common/branches/HDFS-2832 for merging Heterogeneous Storage feature branch (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1550363) * /hadoop/common/trunk * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/docs * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/core * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/StorageType.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockListAsLongs.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/ClientProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeID.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/DatanodeInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LayoutVersion.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/LocatedBlock.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/UnregisteredNodeException.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/ClientNamenodeProtocolTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolClientSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/DatanodeProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolServerSideTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/InterDatanodeProtocolTranslatorPB.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockCollection.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyWithNodeGroup.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java *
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13844897#comment-13844897 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12618118/h2832_20131210.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 48 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks org.apache.hadoop.hdfs.TestDFSUpgradeFromImage org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer org.apache.hadoop.hdfs.server.blockmanagement.TestNodeCount org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestFsck org.apache.hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5690//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5690//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch, h2832_20131210.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843549#comment-13843549 ] Arpit Agarwal commented on HDFS-2832: - {quote} I bring them up again because, 4 months later, I was wondering if you had any thoughts on potential solutions that could be added to the doc. It's fine if automatic migration, open files, more elaborate resource management, and additional storage types are all not in immediate scope, but I assume we'll want them in the future. {quote} Andrew, automatic migration is not in scope for our design. wrt open files can you describe a specific use case you think we should be handling that we have not described? Maybe it will help me understand your concern better. If you are concerned about reclaiming capacity for in use blocks, that is analogous to asking If a process keeps a long-lived handle to a file what will the operating system do to reclaim disk space used by the file? and the answer is the same - nothing. I don't want anyone reading your comments to get a false impression that the feature is incompatible with SCR. {quote} Well, CDH supports rolling upgrades in some situations. ATM is working on metadata upgrade with HA enabled (HDFS-5138) and I've seen some recent JIRAs related to rolling upgrade (HDFS-5535), so it seems like a reasonable question. At least at the protobuf level, everything so far looks compatible, so I thought it might work as long as the handler code is compatible too. {quote} I am not familiar with how CDH does rolling upgrades so I cannot tell you whether it will work. You recently bumped the layout version for caching so you might recall that HDFS layout version checks prevent a DN registering with an NN with a mismatched version. To my knowledge HDFS-5535 will not fix this limitation either. That said, we have retained wire-protocol compatibility. {quote} Do you forsee heartbeats and block reports always being combined in realistic scenarios? Or are there reasons to split it? Is there any additional overhead from splitting? Can we save any complexity by not supporting split reports? I see this on the test matrix. {quote} I thought I answered it, maybe if you describe your concerns I can give you a better answer. When the test plan says 'split' I meant splitting the reports across multiple requests. Reports will always be split by storage but we are not splitting them across multiple messages for now. What kind of overhead are you thinking of? {quote} b1. Have you put any thought about metrics and tooling to help users and admins debug their quota usage and issues with migrating files to certain storage types? {quote} We'll include it in the next design rev as we start phase 2. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842846#comment-13842846 ] Andrew Wang commented on HDFS-2832: --- I'm doing my best here to make this a friendly technical discussion. I guess my timing here is unfortunate with the merge vote pending, but since I don't intend to vote -1, this is just me reading the design doc again and posting more comments. Please take it as such. I'll restate that I think everything in the branch so far looks great. Arpit, I actually reviewed the previous comment history very carefully before posting, as well as the updated design document. Yes, I know that I made some of these comments before, but I brought them up in the first place because I felt like they were interesting problems related to this feature. I bring them up again because, 4 months later, I was wondering if you had any thoughts on potential solutions that could be added to the doc. It's fine if automatic migration, open files, more elaborate resource management, and additional storage types are all not in immediate scope, but I assume we'll want them in the future. I'd also like if the design doc was reworked to reflect what's in phase 1 vs. phase 2 vs. future work. This would also save me from making comments I apparently shouldn't be making on WIP sections (e.g. 6.2, 6.4). I'll say this again too, if phase 1 is done and phase 2 is starting, it seems like the design of phase 2 is exactly what be the focus of our attention right now. The back and forth: I mention SCR over and over again because HBase is very interested in using SSDs, and I figured supporting one of our biggest downstream projects would be a prime use case and potentially in scope. I'd be sad if not, but it'd be good to at least gather their requirements and see how we might get there. bq. HDFS does not support rolling upgrades today. Well, CDH supports rolling upgrades in some situations. ATM is working on metadata upgrade with HA enabled (HDFS-5138) and I've seen some recent JIRAs related to rolling upgrade (HDFS-5535), so it seems like a reasonable question. At least at the protobuf level, everything so far looks compatible, so I thought it might work as long as the handler code is compatible too. These question were also not answered: bq. Do you forsee heartbeats and block reports always being combined in realistic scenarios? Or are there reasons to split it? Is there any additional overhead from splitting? Can we save any complexity by not supporting split reports? I see this on the test matrix. b1. Have you put any thought about metrics and tooling to help users and admins debug their quota usage and issues with migrating files to certain storage types? Especially because of SCR. Thanks Arpit. Again, this feedback isn't really merge related, it's just technical discussion. Not trying to block anything here. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841875#comment-13841875 ] Andrew Wang commented on HDFS-2832: --- Hi folks, sorry for just now looping back to this JIRA. I read the updated design doc, and had a few more questions: * Storage is not always truly hierarchical, it depends on your provisioning. The current strategy of always falling back from SSD to HDD is more ambiguous when you have more than two storage types, especially with something like a tape or NAS tier. Maybe this should be configurable somehow. * I'd like to see more discussion in the doc of migrating blocks that are currently open for short-circuit read. SCR is very common, and if this isn't handled, it makes it hard to use for an application like HBase which tries to opens all of its files once via SCR. FWIW, the HBase committers I've talked to are very interested in HSM and SSDs, so it might be helpful to get their thoughts on this topic and the feature more generally. * Is this going to work with rolling upgrades? * Do you forsee heartbeats and block reports always being combined in realistic scenarios? Or are there reasons to split it? Is there any additional overhead from splitting? Can we save any complexity by not supporting split reports? I see this on the test matrix. * Have you looked at the additional memory overhead on the NN and DN from splitting up storages? With 10 disks on a DN, this could mean effectively 10x the number of DNs as before. I think this is still insignificant, but you all know better than me. * I'd like to see more description of the client API, namely the file attribute APIs. I'll also note that LocatedBlock is not a public API; you can hack around by downcasting BlockLocation to HdfsBlockLocation to fish out the LocatedBlock, but ultimately we probably want to expose StorageType in BlockLocation itself. API examples would be great, from both the command line and the programmatic API. * Have you put any thought about metrics and tooling to help users and admins debug their quota usage and issues with migrating files to certain storage types? Especially because of SCR. * One of the mentioned potential uses is to do automatic migration between storage types based on usage patterns. In this kind of scenario, it's necessary to support more expressive forms of resource management, e.g. YARN's fair scheduler. Quotas by themselves aren't sufficient. * I think this earlier question/answer didn't make it into the doc: what happens when this file is distcp'd or copied? Arpit's earlier answer of clearing this field makes sense (or maybe we need a {{cp -a}} command). Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841935#comment-13841935 ] Suresh Srinivas commented on HDFS-2832: --- bq. Hi folks, sorry for just now looping back to this JIRA. Is this not too late to loop back now, after design published and work started many months ago? Doing this after the merge vote is called (with 3 days to wrap up the voting) seems like a strange choice of timing to me. As regards to client APIs, we can certainly discuss them post phase 1 merge and when the work starts on phase 2 in relevant jiras. Hopefully [~arpitagarwal] can provide answers to the technical questions. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841974#comment-13841974 ] Andrew Wang commented on HDFS-2832: --- Most of these questions pertain to phase 2 features, not phase 1. If phase 1 is done and phase 2 starting, it seems now is the appropriate time to be asking phase 2 design questions. I don't have any technical issues with the code in the branch right now. I'd really appreciate if phase 1 and phase 2 (and phase x?) features could be divided up in the design doc though, since I don't think this phased implementation plan is mentioned in there right now. I'm sure it'd help other reviewers too. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842027#comment-13842027 ] Arpit Agarwal commented on HDFS-2832: - {quote} Is there any additional overhead from splitting? {quote} DNs are not splitting block reports right now and there is no extra overhead in NN to handle it. {quote} Have you looked at the additional memory overhead on the NN and DN from splitting up storages? With 10 disks on a DN, this could mean effectively 10x the number of DNs as before. I think this is still insignificant, but you all know better than me. {quote} Yes, the bulk of the space is consumed by the block information, the volumes themselves are insignificant. {quote} I'd like to see more description of the client API, namely the file attribute APIs. I'll also note that LocatedBlock is not a public API; you can hack around by downcasting BlockLocation to HdfsBlockLocation to fish out the LocatedBlock, but ultimately we probably want to expose StorageType in BlockLocation itself. API examples would be great, from both the command line and the programmatic API. I think this earlier question/answer didn't make it into the doc: what happens when this file is distcp'd or copied? Arpit's earlier answer of clearing this field makes sense (or maybe we need a cp -a command). {quote} As mentioned earlier we'll document these in more detail once we start work on them i.e. post phase-1 merge. {quote} I'd like to see more discussion in the doc of migrating blocks that are currently open for short-circuit read. SCR is very common {quote} It's in the doc. The operating system, be it Unix or Windows will not let you remove a file as long as any process has a handle open to it. Even if the file looks deleted, it will remain on disk at least until the last open handle goes away. Quota permitting, the application can create additional replicas on alternate storage media while it keeps the existing handle open. {quote} Is this going to work with rolling upgrades? {quote} HDFS does not support rolling upgrades today. {quote} Storage is not always truly hierarchical, it depends on your provisioning. The current strategy of always falling back from SSD to HDD is more ambiguous when you have more than two storage types, especially with something like a tape or NAS tier. Maybe this should be configurable somehow. One of the mentioned potential uses is to do automatic migration between storage types based on usage patterns. In this kind of scenario, it's necessary to support more expressive forms of resource management, e.g. YARN's fair scheduler. Quotas by themselves aren't sufficient. {quote} We have tried to avoid the term 'hierarchical' because we are not adding any awareness of storage hierarchy. HDD is just the most sensible default although as we mention in the future section we can look into adding support for tapes etc. Automatic migration is mentioned for completeness but we are not thinking about the design for it now. Anyone from the community is free to post ideas however. Andrew, the design is the same you reviewed in August and we have discussed some of these earlier so I encourage you to read through the comment history. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837894#comment-13837894 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12616810/20131203-HeterogeneousStorage-TestPlan.pdf against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5623//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838097#comment-13838097 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12616817/h2832_20131203.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 48 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5624//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5624//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13836852#comment-13836852 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12616582/h2832_20131202.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 48 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5608//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5608//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837024#comment-13837024 ] Arpit Agarwal commented on HDFS-2832: - The {{TestOfflineEditsViewer}} failure is expected and can be fixed with the attached editsStored binary file. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13832323#comment-13832323 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12615762/20131125-HeterogeneousStorage-TestPlan.pdf against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5572//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13831144#comment-13831144 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12615521/h2832_20131124.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 48 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5558//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5558//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830788#comment-13830788 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12615471/h2832_20131123.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 50 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: org.apache.hadoop.hdfs.server.namenode.TestCheckpoint org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5556//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5556//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5556//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830406#comment-13830406 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12615385/h2832_20131122.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 47 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated -12 warning messages. {color:red}-1 eclipse:eclipse{color}. The patch failed to build with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer org.apache.hadoop.hdfs.TestDatanodeConfig {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5544//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5544//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5544//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13830589#comment-13830589 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12615442/h2832_20131122b.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 49 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core: org.apache.hadoop.hdfs.server.datanode.TestDeleteBlockPool org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover org.apache.hadoop.hdfs.TestDatanodeConfig org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5553//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5553//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5553//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13829486#comment-13829486 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12615196/h2832_20131121.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 45 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated -2 warning messages. {color:red}-1 eclipse:eclipse{color}. The patch failed to build with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5532//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5532//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826875#comment-13826875 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614660/h2832_20131119.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 45 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode org.apache.hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5488//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5488//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5488//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827418#comment-13827418 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614794/h2832_20131119b.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 45 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5499//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5499//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13825796#comment-13825796 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614486/h2832_20131118.patch against trunk revision . {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5464//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826214#comment-13826214 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614486/h2832_20131118.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 45 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5473//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5473//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5473//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826216#comment-13826216 ] Junping Du commented on HDFS-2832: -- Filed HDFS-5527 to fix TestUnderReplicatedBlocks (already have a quick fix patch). The failure of TestOfflineEditsViewer is due to missing of new binary file of editsStored. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13824253#comment-13824253 ] Arpit Agarwal commented on HDFS-2832: - {quote} My point though is about the deterministic nature of pseudo random sequences. In the sense that if you initiate two independent PRNGs with the same seed, they will generate exactly the same sequences. {quote} Seed collision is even less likely than outright UUID collision. The seed space for SHA1PRNG is 2^160. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13822352#comment-13822352 ] Konstantin Shvachko commented on HDFS-2832: --- The probability of no collision is P = n!/((n-m)! n^m). Nicholas, your math is good, as usually. It works well for random events. My point though is about the deterministic nature of pseudo random sequences. In the sense that if you initiate two independent PRNGs with the same seed, they will generate exactly the same sequences. Sorry for late reply, travelling. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823237#comment-13823237 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12613979/h2832_20131114.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 44 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5440//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5440//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5440//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13821047#comment-13821047 ] Junping Du commented on HDFS-2832: -- The unit test failure may due to no cleanup as I still see old 48 version in edit xml in Jenkins test log (my local env is OK) Filed HDFS-5510 to fix a findbug warning. The audit warning can be ignored. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13821541#comment-13821541 ] Arpit Agarwal commented on HDFS-2832: - TestOfflineEditsViewer is due to missing binary file editsStored. TestListCorruptFileBlocks looks like spurious timeout, I cannot duplicate the failure. Looking at TestDFSStartupVersions. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13820698#comment-13820698 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12613434/h2832_20131112.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5408//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13820973#comment-13820973 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12613511/h2832_20131112b.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 43 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDFSStartupVersions org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5414//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5414//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5414//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5414//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13818797#comment-13818797 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12613098/h2832_20131110b.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 43 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 9 new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyIsHot The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDFSStartupVersions org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5384//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5384//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5384//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5384//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13818628#comment-13818628 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12613063/h2832_20131110.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 42 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 9 new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyIsHot org.apache.hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks org.apache.hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshot The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDFSStartupVersions org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5380//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5380//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5380//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5380//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816394#comment-13816394 ] Tsz Wo (Nicholas), SZE commented on HDFS-2832: -- With a billion nodes the probability of a collision in a 128-bit space is less than 1 in 10^20. ... Let n be the number of possible IDs. Let m be the number of nodes. The probability of no collision is P = n!/((n-m)! n^m). Put n=2^128 and m=10^9, we have * P ~= 0.853063206294150856 The probability of collision is * 1-P ~= 1.4693679370584914464 * 10^(-21) 10^(-20). However, randomly generated UUIDs only have 122 random bits accoring to [Wikipedia|http://en.wikipedia.org/wiki/UUID#Random_UUID_probability_of_duplicates]. Now put n=2^122 and m=10^9, we have * P ~= 0.9990596045202825654743 The probability of collision is * 1-P ~= 9.403954797174345257 * 10^(-20) 10^(-19) Similar result can be obtained using approximation P ~= exp(-m^2/(2*n)). Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13816397#comment-13816397 ] Tsz Wo (Nicholas), SZE commented on HDFS-2832: -- ... Even though unlikely, a collision if it happens creates a serious problem for the system integrity. Does it concern you? It depends on how small the probability is - certainly not for 10^(-19). - Below is quoted from [Wikipedia|http://en.wikipedia.org/wiki/UUID#Random_UUID_probability_of_duplicates] {quote} To put these numbers into perspective, the annual risk of someone being hit by a meteorite is estimated to be one chance in 17 billion, which means the probability is about 0.006 (6 × 10^(−11)), equivalent to the odds of creating a few tens of trillions of UUIDs in a year and having one duplicate. In other words, only after generating 1 billion UUIDs every second for the next 100 years, the probability of creating just one duplicate would be about 50%. The probability of one duplicate would be about 50% if every person on earth owns 600 million UUIDs. {quote} - I beg you have heard [risk of cosmic rays|http://stackoverflow.com/questions/2580933/cosmic-rays-what-is-the-probability-they-will-affect-a-program] argurment. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, H2832_20131107.patch, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13815376#comment-13815376 ] Konstantin Shvachko commented on HDFS-2832: --- Arpit, I think we just agreed that collisions among UUIDs are possible but have low probability. This is a concern for me. Even though unlikely, a collision if it happens creates a serious problem for the system integrity. Does it concern you? In my previous comment I tried to explain that in distributed case the randomness of it is the main problem. Forget for a moment about PRNGs. Assume that UUID is an incremental counter (such as generation stamp (and now block id)), which is incremented by each node independently but at start up each chooses a randomly number to start from. On a single node ++ can go on without collisions for a long enough time to guarantee I will never see it. Y4K bug is fine with me. But if you take the second node and randomly choose a starting number it could be close to (1000 apart) the starting point of the first node. Then the second node can only generate 1000 storageIDs before colliding with those generated by the other node. The same is with PRNG you just replace ++ with next(). Long period doesn't matter if you choose your starting points randomly. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13813784#comment-13813784 ] Konstantin Shvachko commented on HDFS-2832: --- UUID#randomUUID generates RFC-4122 compliant UUIDs which are unique *for all practical purposes* RFC-4122 has a special note about distributed applications. But let's just think about it in general. randomUUID is based on pseudo random sequence of numbers, which is like a Mobius Strip or just a loop. It actually works well if you generate IDs on a single node, because the sequence lasts long without repetitions. In our case we initiate thousands of pseudo random sequences (one per node), each starting from a random number. Let's mark those starting numbers on the Mobius Strip or the loop. Then we actually decreased the probability of uniqueness because now in order to get a collision one of the nodes need to reach the starting point of another node, rather than going all around the loop. So in distributed environment we increase the probability of collision with each new node added. And when you add more storage types per node you further increase the collision probability. for all practical purposes as I understand it in the case means that probability of non-unique IDs is low. But it does not mean impossible. The consequences of a storageID collision are pretty bad, hard to detect and recover. At the same time {{DataNode.createNewStorageId()}} generates unique IDs as of today. Why changing it to a problematic approach? Part of the rationale is in HDFS-5115. Making them UUIDs simplifies the generation logic. Looks like HDFS-5115 was based on an incomplete assumption: bq. The Storage ID is currently generated from the DataNode's IP+Port+Random components while in fact it also includes currentTime, which guarantees the uniqueness of ids generated on the same node, unless somebody resets the machine clock to the past. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13814025#comment-13814025 ] Arpit Agarwal commented on HDFS-2832: - Konstantin, UUID generation uses a cryptographically secure PRNG. On Linux this is /dev/random, the fallback is SHA1PRNG with a period of 2^160. With a billion nodes the probability of a collision in a 128-bit space is less than 1 in 10^20. Note that what was previously the storageID is now the datanode UUID and it is generated once for the lifetime of a datanode. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13813495#comment-13813495 ] Konstantin Shvachko commented on HDFS-2832: --- Hey guys, I was wondering if we really need to change storageID to UUID. I thought that the storageID approach that _each DN is able to generate a unique id independently of the others_ is a good feature to retain. UUID as you noted is not unique and needs to be coordinated through NameNode. I understand you have multiple storages on the same DN, and you need unique ids independently of the ip, and port. # They should be unique with existing implementation of {{createNewStorageId()}}. {code}storageid = random, ip, port, currentTime{code} If you generate ids sequentially one after another, currentTime should be different. It can be replaced by nano-time if id generation is done in different threads. # You can also add to storageID an attribute that characterizes the disk volume or the directory as a new component. Examples of the new attribute could be disk serial number, or the storage directory inode number. It seems that introduction of UUIDs was unnecessary, unless of course I missed some context. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13813539#comment-13813539 ] Arpit Agarwal commented on HDFS-2832: - Konstantin, {quote} I thought that the storageID approach that each DN is able to generate a unique id independently of the others is a good feature to retain. {quote} Storage (UU)IDs are independently generated on the Datanode in {{DataStorage#format}}. {quote} UUID as you noted is not unique and needs to be coordinated through NameNode. {quote} Not true. {{UUID#randomUUID}} generates RFC-4122 compliant UUIDs which are unique for all practical purposes without NameNode coordination. {quote} You can also add to storageID an attribute that characterizes the disk volume or the directory as a new component. Examples of the new attribute could be disk serial number, or the storage directory inode number. It seems that introduction of UUIDs was unnecessary, unless of course I missed some context. {quote} Part of the rationale is in HDFS-5115. Making them UUIDs simplifies the generation logic. Decoupling them from volume/directory characteristics allows future storage media that do not have a disk serial number or inode number. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13808115#comment-13808115 ] Arpit Agarwal commented on HDFS-2832: - We need to merge with HDFS-4949 changes that got integrated into trunk last night. Looking at it. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, h2832_20131023b.patch, h2832_20131023.patch, h2832_20131025.patch, h2832_20131028b.patch, h2832_20131028.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13807681#comment-13807681 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12610761/h2832_20131028b.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5302//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, h2832_20131023b.patch, h2832_20131023.patch, h2832_20131025.patch, h2832_20131028b.patch, h2832_20131028.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13805489#comment-13805489 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12610345/h5417_20131025.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5278//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, h2832_20131023b.patch, h2832_20131023.patch, h5417_20131025.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13805492#comment-13805492 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12610345/h5417_20131025.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5279//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, h2832_20131023b.patch, h2832_20131023.patch, h5417_20131025.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13803170#comment-13803170 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12609899/h2832_20131023.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5267//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, h2832_20131023.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760509#comment-13760509 ] Arpit Agarwal commented on HDFS-2832: - Thanks for the feedback Eric. Both of these would be good to have and came up during design discussions but we have not addressed either. For #2, in addition to your points there are other locations where storage directories are assumed to be File-addressable. I am not sure of the amount of work involved here. Multiple replicas per Datanode looks easier and can be done on top of the Heterogeneous Storage work. We would need phase 1 of the feature to support multiple storages. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13759132#comment-13759132 ] Eric Sirianni commented on HDFS-2832: - Arpit and team - nice document. A few questions and comments: # Does the design allow a _single_ datanode to have multiple replicas of a block (presumably each on different storage types - e.g. SSD and HDD)? If so _(and I think it should)_, this would seem to require some refactoring of the {{FSDatasetInterface}} which is oriented around the fact that a block maps to a single volume (e.g. {{FSVolume getVolume(Block b)}}). # Does the design intend to allow for Storage types (e.g. RAM) to be backed by non-file-addressable stores? If so _(and I think it should)_, this would also require some redesign of some areas: #* The {{FSDatasetInterface}} abstraction which allows for pluggable (i.e. non-file-addressable) block storage mechanisms is global to the DataNode. Perhaps it should be pluggable on a _per-storage_ basis - e.g. having a {{MemoryFSDataset}} and a {{FileFSDataset}} implementation co-existing within a single DataNode instance. Thinking about this some more, this might also help address my point above re: {{FSDatasetInterface}} being oriented around a single {{FSVolume}} per block. #* Areas where File-addressable block access is assumed outside the {{FSDatasetInterface}} abstraction: #** {{BlockSender}} / {{BlockReceiver}} which downcast to FileInputStreams to obtain the underlying FD #** {{DataStorage.linkBlocks}} which uses hardlinks for upgrade/revert scenarios Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13751056#comment-13751056 ] Arpit Agarwal commented on HDFS-2832: - Andrew, {quote} Grep for this line: Storage preferences could be specified per-file or per-directory. {quote} I meant to say we considered both options but chose to go with per-file preferences. I should have worded it better. :-) {quote} Is this a stream API? The doc only mentions specifying storage preferences at create time via DFSClient#create. {quote} This is briefly mentioned in section 6.4 - _File Attribute APIs to query, set and clear file attributes which will be used to modify Storage Preferences_. We have describe File Attributes in detail before we start working on the feature API. {quote} I'm wondering how this works for the case where I write to SSD, run out of capacity, then want write the rest of my file to HDD. Do I need to close the file, modify the Storage Preference, then reopen for append? This also potentially requires migrating the last block to HDD, since storage types are tracked per-block, and then you might hit the HBase keeps fds open forever issue. {quote} We differentiate between running out of capacity and running out of quota (7.1, 1b and 1c). HDFS handles _Out of Capacity_ transparently by allocating subsequent blocks on HDD as fallback and does not require any migration to make forward progress. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13751064#comment-13751064 ] Arpit Agarwal commented on HDFS-2832: - Hi [~djp], thanks for looking at the doc. {quote} My additional comments is: we should consider the case that remote storage that fall into the same failure group. i.e. Storage Volumes of Node A, Node B is on NAS-a, so only 1 replica should be placed on these two nodes although they may on different racks. {quote} We have not considered this use case. Are you running multiple DataNodes over the same NAS for redundancy? Please feel free to file a Jira with your feature idea and motivation and we can discuss how to proceed after we have initial level of support for Heterogeneous Storages. Sound good? Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13751195#comment-13751195 ] Junping Du commented on HDFS-2832: -- Thanks for quickly reply. [~arpitagarwal]. bq. We have not considered this use case. Are you running multiple DataNodes over the same NAS for redundancy? I have two use cases in my mind: 1. As one kind of DR solution, user can choose to put 1 or 2 replica on a remote reliable storage (i.e. NAS backed with SAN). Multiple DNs connect to NAS will have more bandwidth. 2. In virtualization case, some virtual machines are backed with cluster FS (i.e. VMFS) on shared storage rather than local disks (it may not be the most cost-effective way but not corner case in enterprise environment). Some DNs on VMs could be backed with same shared storage. bq. Please feel free to file a Jira with your feature idea and motivation and we can discuss how to proceed after we have initial level of support for Heterogeneous Storages. Sound good? Yes. That make sense. Will file JIRA later. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13751893#comment-13751893 ] Todd Lipcon commented on HDFS-2832: --- Just read through the design doc. Thanks for writing it up. One thing I noticed under tooling that is not covered is extensions to the fs -du and/or fs -count command. Given that SSD is typically a very constrained resource, I imagine that administrators will want to be able to see where the space usage is going on a directory hierarchy level, in order to chase down users who are using too much (or applications which accidentally are specifying all their files should be on SSD). Similarly, what's the planned way in which existing FsShell commands that write data will be extended to specify these policies? For example, if I want to upload a file with hadoop fs -put, how can I specify that it should go on SSD? Maybe I missed it, but are there other CLI extensions planned in order to retrieve and set the file attributes necessary to migrate data? Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13752133#comment-13752133 ] Arpit Agarwal commented on HDFS-2832: - Todd, thanks for reviewing and the feedback. FsShell extensions to query and set storage policies are mentioned in sec 8,2 and agreed about the remaining two also. I'll mention them when I rev the design doc. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750715#comment-13750715 ] Arpit Agarwal commented on HDFS-2832: - Hi Andrew, Thanks for looking at doc, great questions. {quote} - For quota management, have you considered the YARN-like abstraction of users and pools? We're moving down that avenue in HDFS-4949, and it'd be nice to eventually have a single abstraction if we can. I get that for a first cut, it's easier to stick with the existing disk quota system. {quote} I am interested in seeing how the pools abstraction is defined and whether it can cover all our use cases. Do you have a design doc? We chose this approach because it extends the existing quota system and APIs and more importantly covers our use cases. {quote} - How do you expect applications to handle runtime failures? If I have a stream open and my write fails due to lack of SSD quota, can I change it to retry the write to HDD? Do I get metrics so I can alert somewhere? {quote} Out of quota is a hard failure just like hitting the disk space quota limit. The application must change the Storage Preference on the file to continue. We have not discussed metrics yet. {quote} - How do you handle block migration of files opened by long-lived applications like HBase that also use short-circuit local reads? Let's say HBase initially writes all its files to SSD, then we want to periodically migrate them to HDD. HBase holds onto the SSD file descriptors indefinitely, preventing reclamation of SSD capacity. {quote} Quota should be blocked indefinitely until the files can be moved off their current Storage Type. We did not cover this use case, so thanks for calling it out! I will make the update. {quote} - If File Storage Preferences are part of file metadata, what happens when the files are copied, or distcp'd to another cluster? {quote} I think this is a tools decision. We probably want to lose the File Attributes, with an option to preserve them. We will document this in more detail when we get to updating the tools. {quote} - Why do we want a default Storage Preferences specified on a directory? I'd actually prefer if we make applications explicitly request special treatment when they open a stream. {quote} Storage Preferences are not supported on directories. Please let me know if you see anything in the doc implying otherwise and I will fix it. {quote} - Let's say I'm a cluster operator, and have nodes with both PCI-e and SATA SSDs. Can I differentiate between them? How about if I add nodes with an unknown StorageType like NVRAM? Basically: what's required to add a new StorageType? - Also related, when I bring up a new StorageType in my cluster, how do I make my applications start using it? Do I need to submit patches to HBase to now know how to use NVRAM properly? This seems like one of the downsides of physical storage types, logical means apps can do this more automatically. {quote} Adding a new StorageType needs code and update to the StorageType enum. We made the trade-off for API and implementation simplicity for v1 but we are not ruling out adding support for logical classification in the future. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750751#comment-13750751 ] Andrew Wang commented on HDFS-2832: --- bq. I am interested in seeing how the pools abstraction is defined... I've got a little writeup in the HDFS-4949 design doc, but we're still figuring it out. Arun expressed interest in reviewing our RM plan, so you could ask him about pools/etc too. bq. The application must change the Storage Preference on the file to continue. Is this a stream API? The doc only mentions specifying storage preferences at create time via {{DFSClient#create}}. I'm wondering how this works for the case where I write to SSD, run out of capacity, then want write the rest of my file to HDD. Do I need to close the file, modify the Storage Preference, then reopen for append? This also potentially requires migrating the last block to HDD, since storage types are tracked per-block, and then you might hit the HBase keeps fds open forever issue. It'd be nice to see something in the doc discussing the above, under construction replicas are hard :) bq. Storage Preferences are not supported on directories. Please let me know if you see anything in the doc implying otherwise and I will fix it. Grep for this line: Storage preferences could be specified per-file or per-directory. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13749510#comment-13749510 ] Junping Du commented on HDFS-2832: -- Nice doc. My additional comments is: we should consider the case that remote storage that fall into the same failure group. i.e. Storage Volumes of Node A, Node B is on NAS-a, so only 1 replica should be placed on these two nodes although they may on different racks. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13748975#comment-13748975 ] Andrew Wang commented on HDFS-2832: --- I finally got a chance to read this doc, nice work. A few questions: - For quota management, have you considered the YARN-like abstraction of users and pools? We're moving down that avenue in HDFS-4949, and it'd be nice to eventually have a single abstraction if we can. I get that for a first cut, it's easier to stick with the existing disk quota system. - How do you expect applications to handle runtime failures? If I have a stream open and my write fails due to lack of SSD quota, can I change it to retry the write to HDD? Do I get metrics so I can alert somewhere? - How do you handle block migration of files opened by long-lived applications like HBase that also use short-circuit local reads? Let's say HBase initially writes all its files to SSD, then we want to periodically migrate them to HDD. HBase holds onto the SSD file descriptors indefinitely, preventing reclamation of SSD capacity. - If File Storage Preferences are part of file metadata, what happens when the files are copied, or distcp'd to another cluster? - Why do we want a default Storage Preferences specified on a directory? I'd actually prefer if we make applications explicitly request special treatment when they open a stream. - Let's say I'm a cluster operator, and have nodes with both PCI-e and SATA SSDs. Can I differentiate between them? How about if I add nodes with an unknown StorageType like NVRAM? Basically: what's required to add a new StorageType? - Also related, when I bring up a new StorageType in my cluster, how do I make my applications start using it? Do I need to submit patches to HBase to now know how to use NVRAM properly? This seems like one of the downsides of physical storage types, logical means apps can do this more automatically. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13741504#comment-13741504 ] Bikas Saha commented on HDFS-2832: -- Nice doc! If there are multiple hard drives on a data node will they be represented as a single storage volume of type HDD. Will it be legal to have multiple storage volumes of the same type on the same data node? Say 1 volume per HDD. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13741519#comment-13741519 ] Arpit Agarwal commented on HDFS-2832: - Hi Bikas, each storage directory/volume (we used the terms interchangeably) will be represented as a distinct DataNodeStorage. They will not be grouped by types. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13741557#comment-13741557 ] Bikas Saha commented on HDFS-2832: -- Sorry I am still not clear on this. Let me re-phrase. If there are 10 disks on a data node, will it be legal to create 2 volumes of type HDD on that Datanode with 5 disks each? I am guessing that the volume+type list for a datanode will come from config. i.e. the datanode will not be figuring this out automatically by inspecting the hardware on the machine. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13741568#comment-13741568 ] Arpit Agarwal commented on HDFS-2832: - Yes, directory+type will be read from the configuration (sec 5.2.1). A volume corresponds to a single storage directory, so if you have 10 unstriped disks there will be 10 at least storage volumes. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739734#comment-13739734 ] John George commented on HDFS-2832: --- To support Storage Types, each DataNode must be treated as a collection of storages. (excerpt from pdf) Consider a cluster with a set of DataNodes with high end hardware (eg: SSD), and another set of DataNodes with low end hardware (eg: HDD). Each datanode is homogenous by itself, but the cluster itself is heterogeneous. Can the user still specify storage preference using StorageType and get expected results? Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739884#comment-13739884 ] Arpit Agarwal commented on HDFS-2832: - John, {quote}Can the user still specify storage preference using StorageType and get expected results? {quote} We don't make any assumptions about the cluster layout. The storages attached to a DataNode may be of the same of different types. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739170#comment-13739170 ] Tsz Wo (Nicholas), SZE commented on HDFS-2832: -- Just have created a new branch: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-2832/ Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718853#comment-13718853 ] Andrew Wang commented on HDFS-2832: --- Hey Arpit, like Bikas I'd also love to see a design doc. I'm also potentially interested in picking up some of the outstanding subtasks, since they seem very related to HDFS-4949. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718865#comment-13718865 ] Suresh Srinivas commented on HDFS-2832: --- There may not be lot of overlap between this and HDFS-4949, other than indicating storage type as memory in block report. That said you are welcome to pickup the tasks. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718951#comment-13718951 ] Arpit Agarwal commented on HDFS-2832: - Hi [~andrew.wang], that's great to hear. I got sidetracked into some retry cache related changes for 2.1.0-beta. I will post the doc next week. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13711356#comment-13711356 ] Bikas Saha commented on HDFS-2832: -- Is there an overall design document that one can follow? HDFS-2802 was great in this regard with initial docs followed by revisions. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13711362#comment-13711362 ] Andrew Purtell commented on HDFS-2832: -- [~sanjay.radia] bq. Your recent tweet about this before giving me a chance to respond is unfortunate, especially given that HDFS-4672 is overlapping with HDFS-2832. - isn't this the pot calling the kettle black? This argumentation rather than hearing those concerns, the scope increase here in the first place that ignored HDFS-4672 - and now in addition it is a duplicate, and the unanswered private communication all indicate that there is some kind of competitive dynamic here that is not helpful. I prefer to defuse the situation and move on rather than engage in some kind of debate club atmosphere. Therefore today I resolved HDFS-4672 as Later. Please see my comment at the tail of it. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13711771#comment-13711771 ] Arpit Agarwal commented on HDFS-2832: - [~bikassaha] We'll post a design doc by next week. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13707230#comment-13707230 ] Andrew Purtell commented on HDFS-2832: -- Doesn't the above repurposing of this JIRA overlap substantially with HDFS-4672? We already opened that JIRA to see if folks may be interested in collaborating on tiered storage support. I'm curious why that hasn't happened substantially there but there is this post on this issue today. Many of the items on the above list could have been taken from HDFS-4672. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13707244#comment-13707244 ] Andrew Purtell commented on HDFS-2832: -- {quote} We will open Jiras to track these issues and folks interested in contributing can collaborate on this work. Please reach out to me. - Suresh Srinivas, Sanjay Radia and Arpit Agarwal {quote} As you know there is an email from me reaching out to you on exactly the topic of tiered storage in your inbox still waiting for a response. I know conclusively at least Suresh and Sanjay received a copy. I sent that at the invitation of one of your colleagues last month. Perhaps I can conclude from the silence that email is an incorrect way to reach out, in which case I would welcome your participation on HDFS-4762 in a truly inclusive community process of collaboration, rather than dubious invitations, based on my personal experience, here. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13707388#comment-13707388 ] Nick Dimiduk commented on HDFS-2832: I wonder how these tiers interact with alternative storage services? For instance, I run my cluster in EC2 and want a replica stored on S3. Where does this use case fit in? Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13707418#comment-13707418 ] Suresh Srinivas commented on HDFS-2832: --- bq. Many of the items on the above list could have been taken from HDFS-4672. Wow. Please go read the description. I will make it easy by adding a link here - https://issues.apache.org/jira/browse/HDFS-2832?focusedCommentId=13192326page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13192326. The description while does not spell details, gives an idea of how far sweeping this proposal is. Some of the tasks were prioritized such as making Datanode as a collection storages because they would require significant protocol changes. After that we have slowed down. However this jira can be worked on in multiple independent phases. bq. ... I would welcome your participation on HDFS-4762 in a truly inclusive community process of collaboration, rather than dubious invitations, based on my personal experience, here. So you sending an email on a public mailing list is true inclusive community process. However, a similar comment on this jira is not?!!! I have hard time understanding this. In fact many of the features done in HDFS have had meetups with not only HDFS folks, but also higher stack components such as HBase and experienced Hadoop Ops. So please stop these kinds of comments that you have no way participating in this. Just because you create a jira with additional information, does not make it original jira. It is still a duplicate. Spirit of collaboration that you so strongly talk about could have been demonstrated by just posting a comment on this jira instead of opening a new one. In fact I had told you in person in Hadoop Summit Europe that we will start working on this in April (which we could not, given snapshot feature). I am going to create some of the sub-tasks that I have always been thinking about. I will also leave it open, for people to comment on, do code reviews, and contribute patches to. Hopefully that should help you to participate in this work, if for some reason you did not feel involved already. Again if your frustration is stemming from the fact that you already have code for some of this, we could consider them adding as part of subtasks that is being created. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13707422#comment-13707422 ] Sanjay Radia commented on HDFS-2832: Andrew, I was always confused about the creation of HDFS-4672 given that HDFS-2832 existed for quite a while and with a bunch of subtasks complete. note: Konstantine asked you about the duplication in April What is the difference between this issue and HDFS-2832? in [comment | https://issues.apache.org/jira/browse/HDFS-4672?focusedCommentId=13629403page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13629403] to which you have not responded to date. However noting the description and comments on HDFS-4672 it is clear that HDFS-4672 seems to be focusing on policies for placement. I resigned to letting this Jira (HDFS-2832) focus on the raw mechanisms and HDFS-4672 focus on the policies; as a result I did not raise the issue about you duplicating this jira when you created HDFS-4672. I did that out of politeness and moving forward but perhaps that was a mistake on my part. Also note that I have commented on the other jira and have not ignored it. Your recent tweet about this before giving me a chance to respond is unfortunate, especially given that HDFS-4682 has is overlapping with HDFS-2832. - isn't this the pot calling the kettle black? Getting back to the business building Hadoop: We need to move quickly forward by adding mechanism in HDFS to allow different kinds of storage while the longer discussions on policies continue. Given that a recent Jira on ram caching (HDFS-4949) is planning to move quickly, it is important that we start by providing the the needed mechanisms. We plan to add additional mechanisms via sub-jiras which can support HDFS-4949. You are welcome to contribute with patches and discussions. Similarly we plan to continue to participate in the longer ranged discussions on policies and also contribute with patches. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13707430#comment-13707430 ] Suresh Srinivas commented on HDFS-2832: --- bq. I wonder how these tiers interact with alternative storage services? For instance, I run my cluster in EC2 and want a replica stored on S3. Where does this use case fit in? Currently we are considering the following: local Disk, remote {locak rack|remote rack} disk, local SSD, remote {local rack|remote rack} SSD, local RAM, remote {local rack|remote rack} RAM, RAID, tape, remote storage (such as S3, SAN) etc. We are also considering temporary/permanent replicas etc. Will post some details next week. We can discuss if it includes all the storage types we should be considering. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13707450#comment-13707450 ] Suresh Srinivas commented on HDFS-2832: --- I have created couple of jiras, thus dumping out stuff I have been carrying in my head (and other that I have discussed this with) for a long time. These set of jiras enable support for heterogeneous storages in different tiers. I would like to start working on HDFS-4985 immediately as this might also be useful for expressing datanode cache in HDFS-4949 in a way that is compatible with the proposal here. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13192326#comment-13192326 ] Suresh Srinivas commented on HDFS-2832: --- Advantages: # Support for heterogeneous storages: #* DN could support along with disks, other types of storage such as flash etc. #* Suitable storage can be chosen based on client preference such as need for random reads etc. # Block report scaling: instead of a single monolithic block report, a smaller block report per storage becomes possible. This is important with the growth in disk capacity and number of disks per datanode. # Better granularity of storage failure handling: #* DN could just indicate loss of storage and namenode can handle it better since it knows the list of blocks belonging to a storage. #* DN could locally handle storage failures or provide decommissioning of a storage by marking a storage as ReadOnly. # Hot pluggability of disks/storages: adding and deleting a storage to a node is simplified. # Other flexibility: includes future enhancements to balance storages with in a datanode, balancing the load (number of transceivers) per storage etc and better block placement strategies. Backward compatibility: The existing grouping of all storages under a single storage ID is a specific case of the generalized model proposed above. This change will be backward compatible with the existing deployments. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira