[ 
https://issues.apache.org/jira/browse/HDFS-13209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16387507#comment-16387507
 ] 

Rakesh R commented on HDFS-13209:
---------------------------------

{quote}However, sometimes we might need to keep all files in the same directory 
(consistency constraint) but might want some of them on SSD (small, in my case) 
until they are processed and merged/removed. Then they will go on the default 
policy.
{quote}
Users can set a StoragePolicy on either a directory or a file with 
[fs#setStoragePolicy|https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/FileSystem.html#setStoragePolicy(org.apache.hadoop.fs.Path,%20java.lang.String)].
 I agree with you: presently there is no option to pass a storage policy during 
file creation, so a newly created file inherits the storage policy from its 
parent directory and continues writing blocks using that policy. I'm not 
against this new API proposal, but I think this behavior can be achieved 
at the additional cost of one FileSystem API call.
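
To make the inheritance rule concrete, here is a minimal, self-contained sketch. It is a toy model and not Hadoop code: the class {{PolicyInheritanceSketch}} and its methods are hypothetical names, and it assumes absolute paths without trailing slashes and a default policy of HOT. It only illustrates that a file uses its own policy if one is set, and otherwise the policy of its nearest ancestor.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not Hadoop code) of how an effective storage policy is resolved. */
public class PolicyInheritanceSketch {
    // Explicitly set policies keyed by absolute path (hypothetical structure).
    private final Map<String, String> explicit = new HashMap<>();
    private final String defaultPolicy;

    public PolicyInheritanceSketch(String defaultPolicy) {
        this.defaultPolicy = defaultPolicy;
    }

    /** Records an explicit policy on a file or directory. */
    public void setStoragePolicy(String path, String policy) {
        explicit.put(path, policy);
    }

    /** Walks from the path up to the root and returns the nearest explicit policy. */
    public String effectivePolicy(String path) {
        String current = path;
        while (true) {
            String policy = explicit.get(current);
            if (policy != null) {
                return policy;
            }
            if (current.equals("/")) {
                return defaultPolicy; // nothing set anywhere on the way up
            }
            int slash = current.lastIndexOf('/');
            current = (slash == 0) ? "/" : current.substring(0, slash);
        }
    }

    public static void main(String[] args) {
        PolicyInheritanceSketch ns = new PolicyInheritanceSketch("HOT");
        ns.setStoragePolicy("/myparent", "ALL_SSD");
        // The new file has no policy of its own, so it inherits from /myparent.
        System.out.println(ns.effectivePolicy("/myparent/myfile"));
        // One extra call overrides the inherited policy for this file only.
        ns.setStoragePolicy("/myparent/myfile", "COLD");
        System.out.println(ns.effectivePolicy("/myparent/myfile"));
        // A sibling file is unaffected and still inherits from the directory.
        System.out.println(ns.effectivePolicy("/myparent/other"));
    }
}
```

In this model the per-file override costs exactly one extra {{setStoragePolicy}} call, which mirrors the trade-off described above.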

How about changing the storage policy on the file before writing contents to 
it? I've tried to describe the steps below; please go through them and let me 
know if I missed anything.
{code:java}
Step-1) Assume the parent directory "/myparent" is configured with the ALL_SSD policy.
Step-2) Create a file "/myparent/myfile" under the "/myparent" dir. It 
inherits the ALL_SSD policy from its parent.
Step-3) Change the storage policy of "/myparent/myfile" to the "COLD" storage 
policy, which uses the ARCHIVE storage type.
Step-4) Write data to the file. The data blocks will now be written to 
ARCHIVE storage.
{code}
{code:java}
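// Sample code. Assumes dfs is an initialized DistributedFileSystem and
// replicationFactor is a short declared elsewhere.
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;

    String fileName = "/myparent/myfile";
    final FSDataOutputStream out = dfs.create(new Path(fileName),
        replicationFactor);
    // Switch the policy before the first write so the blocks use it.
    dfs.setStoragePolicy(new Path(fileName), "COLD");
    for (int i = 0; i < 1024; i++) {
      out.write(i);
    }
    out.close();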
{code}
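
As a rough illustration of why Step-4 lands on ARCHIVE, the sketch below maps a policy name to the preferred storage type for each replica. It is a simplified toy model and not the HDFS implementation: {{BlockPlacementSketch}} and {{preferredStorageTypes}} are hypothetical names, fallback storage types are ignored, and only the policies listed in the switch are covered.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Hadoop code) of preferred storage types per replica. */
public class BlockPlacementSketch {

    /** Returns the preferred storage type for each replica, ignoring fallbacks. */
    public static List<String> preferredStorageTypes(String policy, int replication) {
        List<String> types = new ArrayList<>();
        for (int i = 0; i < replication; i++) {
            switch (policy) {
                case "HOT":
                    types.add("DISK");
                    break;
                case "COLD":
                    types.add("ARCHIVE");
                    break;
                case "ALL_SSD":
                    types.add("SSD");
                    break;
                case "WARM":
                    // First replica on DISK, the remaining ones on ARCHIVE.
                    types.add(i == 0 ? "DISK" : "ARCHIVE");
                    break;
                case "ONE_SSD":
                    // First replica on SSD, the remaining ones on DISK.
                    types.add(i == 0 ? "SSD" : "DISK");
                    break;
                default:
                    throw new IllegalArgumentException("Unknown policy: " + policy);
            }
        }
        return types;
    }

    public static void main(String[] args) {
        // Policy inherited from /myparent at creation time (Step-2).
        System.out.println(preferredStorageTypes("ALL_SSD", 3)); // [SSD, SSD, SSD]
        // Policy in effect when the blocks are written (Step-3 and Step-4).
        System.out.println(preferredStorageTypes("COLD", 3));    // [ARCHIVE, ARCHIVE, ARCHIVE]
    }
}
```

The point of the model is that the policy in effect when blocks are written decides the storage type, so changing it between create and the first write is enough.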

> DistributedFileSystem.create should allow an option to provide StoragePolicy
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-13209
>                 URL: https://issues.apache.org/jira/browse/HDFS-13209
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: hdfs
>    Affects Versions: 3.0.0
>            Reporter: Jean-Marc Spaggiari
>            Priority: Major
>
> DistributedFileSystem.create returns an FSDataOutputStream. The stored 
> file and related blocks will use the directory-based StoragePolicy.
>  
> However, sometimes we might need to keep all files in the same directory 
> (consistency constraint) but might want some of them on SSD (small, in my 
> case) until they are processed and merged/removed. Then they will go on the 
> default policy.
>  
> When creating a file, it will be useful to have an option to specify a 
> different StoragePolicy...


