[ https://issues.apache.org/jira/browse/HDFS-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939371#comment-14939371 ]
Yongjun Zhang commented on HDFS-6584:
-------------------------------------

Hi [~szetszwo],

Thanks for the work you and other folks did here. I have a question. Per your comments:

https://issues.apache.org/jira/browse/HDFS-6584?focusedCommentId=14139690&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14139690
https://issues.apache.org/jira/browse/HDFS-6584?focusedCommentId=14148307&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14148307

you merged the feature branch to trunk and branch-2. When I look at the git log for trunk and branch-2, I can see that trunk has

{code}
commit 22a41dce4af4d5b533ba875b322551db1c152878
Author: Tsz-Wo Nicholas Sze <szets...@hortonworks.com>
Date:   Sun Sep 7 07:44:28 2014 +0800

    HDFS-6997: add more tests for data migration and replicaion.
{code}

However, branch-2 doesn't have this commit. I looked at branch-2 and saw that the HDFS-6997 code is there. I checked another subtask of HDFS-6584, and it's the same situation.

It looks like the commit history was collapsed during the branch-2 merge but kept during the trunk merge. Is this intended? Would you please comment on what might have happened?

Thanks much.

> Support Archival Storage
> ------------------------
>
>                 Key: HDFS-6584
>                 URL: https://issues.apache.org/jira/browse/HDFS-6584
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: balancer & mover, namenode
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Tsz Wo Nicholas Sze
>             Fix For: 2.6.0
>
>         Attachments: HDFS-6584.000.patch,
> HDFSArchivalStorageDesign20140623.pdf, HDFSArchivalStorageDesign20140715.pdf,
> archival-storage-testplan.pdf, h6584_20140907.patch, h6584_20140908.patch,
> h6584_20140908b.patch, h6584_20140911.patch, h6584_20140911b.patch,
> h6584_20140915.patch, h6584_20140916.patch, h6584_20140916.patch,
> h6584_20140917.patch, h6584_20140917b.patch, h6584_20140918.patch,
> h6584_20140918b.patch
>
>
> In most of the Hadoop clusters, as more and more data is stored for longer
> time, the demand for storage is outstripping the compute. Hadoop needs a cost
> effective and easy to manage solution to meet this demand for storage.
> Current solution is:
> - Delete the old unused data. This comes at operational cost of identifying
> unnecessary data and deleting them manually.
> - Add more nodes to the clusters. This adds along with storage capacity
> unnecessary compute capacity to the cluster.
> Hadoop needs a solution to decouple growing storage capacity from compute
> capacity. Nodes with higher density and less expensive storage with low
> compute power are becoming available and can be used as cold storage in the
> clusters. Based on policy the data from hot storage can be moved to cold
> storage. Adding more nodes to the cold storage can grow the storage
> independent of the compute capacity in the cluster.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)