[ https://issues.apache.org/jira/browse/HDFS-13209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16387507#comment-16387507 ]
Rakesh R commented on HDFS-13209:
---------------------------------

{quote}However, sometimes we might need to keep all files in the same directory (consistency constraint) but might want some of them on SSD (small, in my case) until they are processed and merged/removed. Then they will go to the default policy.
{quote}
A user can set a StoragePolicy on either a directory or a file via [fs#setStoragePolicy|https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/FileSystem.html#setStoragePolicy(org.apache.hadoop.fs.Path,%20java.lang.String)].

I agree with you: at present there is no option to pass a storage policy during file creation. A newly created file inherits the storage policy of its parent directory, and its blocks are written using that policy.

I'm not against this new API proposal, but the same behavior can be achieved at the cost of one additional FileSystem API call. How about changing the storage policy on the file before writing any contents to it? I've tried to describe the steps below; please go through them and let me know if I missed anything.
{code:java}
Step-1) Assume the parent directory "/myparent" is configured with the ALL_SSD policy.
Step-2) Create a file "/myparent/myfile" under the "/myparent" dir. It inherits the ALL_SSD policy from its parent.
Step-3) Change the storage policy of "/myparent/myfile" to the "COLD" storage policy, which uses the ARCHIVE storage type.
Step-4) Write data to the file. Here, the data blocks will be written to ARCHIVE storage types.
{code}
{code:java}
// Sample code. Assumes dfs is an initialized DistributedFileSystem
// and replicationFactor is a short.
String fileName = "/myparent/myfile";
// Create the file; at this point it inherits the parent's policy.
final FSDataOutputStream out = dfs.create(new Path(fileName), replicationFactor);
// Switch the file to the COLD policy before any data is written.
dfs.setStoragePolicy(new Path(fileName), "COLD");
for (int i = 0; i < 1024; i++) {
  out.write(i);
}
out.close();
{code}

> DistributedFileSystem.create should allow an option to provide StoragePolicy
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-13209
>                 URL: https://issues.apache.org/jira/browse/HDFS-13209
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: hdfs
>    Affects Versions: 3.0.0
>            Reporter: Jean-Marc Spaggiari
>            Priority: Major
>
> DistributedFileSystem.create returns an FSDataOutputStream. The stored
> file and related blocks will use the directory-based StoragePolicy.
>
> However, sometimes we might need to keep all files in the same directory
> (consistency constraint) but might want some of them on SSD (small, in my
> case) until they are processed and merged/removed. Then they will go to the
> default policy.
>
> When creating a file, it will be useful to have an option to specify a
> different StoragePolicy...