[ https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15261581#comment-15261581 ]
Uma Maheswara Rao G commented on HDFS-10285:
--------------------------------------------

{quote}
So does this mean there would be a need to preserve and copy the inherited storage policy in distcp tool?
{quote}
The current implementation does not copy the source storage policy. What do you mean by "preserve" here? Sorry, I did not follow this. Could you elaborate a bit?

{quote}
Yeah, having an API to allow applications to trigger the mover behavior sounds good. As mentioned in the proposal, there is a need in HBase on HDFS HSM. Maybe Jingcheng Du and Wei Zhou could have detailed description about this as I know you have the relevant work.
{quote}
That will be great!

Thanks a lot, Kai, for your comments.

> Storage Policy Satisfier in Namenode
> ------------------------------------
>
>                 Key: HDFS-10285
>                 URL: https://issues.apache.org/jira/browse/HDFS-10285
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>    Affects Versions: 2.7.2
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>
> Heterogeneous storage in HDFS introduced the concept of storage policy. These
> policies can be set on a directory or file to specify the user's preference
> for where to store the physical blocks. When the user sets the storage policy
> before writing data, the blocks can take advantage of the storage policy
> preference and are stored accordingly.
> If the user sets the storage policy after writing and completing the file,
> the blocks will already have been written with the default storage policy
> (that is, DISK). The user then has to run the ‘Mover tool’ explicitly,
> specifying all such file names as a list. In some distributed system
> scenarios (e.g. HBase), it would be difficult to collect all the files and
> run the tool, as different nodes can write files separately and files can
> have different paths.
> Another scenario is when the user renames a file from a directory with one
> effective storage policy (inherited from the parent directory) to a directory
> with a different effective storage policy. The rename does not copy the
> inherited storage policy from the source, so the file takes the storage
> policy of the destination file/dir parent. This rename operation is just a
> metadata change in the Namenode. The physical blocks still remain placed
> according to the source storage policy.
> So tracking all such file names, which depend on business logic, across
> distributed nodes (e.g. region servers) and running the Mover tool could be
> difficult for admins.
> The proposal here is to provide an API in the Namenode itself to trigger
> storage policy satisfaction. A daemon thread inside the Namenode should track
> such calls and send them to the DNs as movement commands.
> Will post the detailed design thoughts document soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)