[ https://issues.apache.org/jira/browse/HDFS-12090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092534#comment-16092534 ]
Rakesh R commented on HDFS-12090:
---------------------------------

bq. That's what the -createMountOnly flag is expected to do. Are you referring to something else?

(Sorry about the confusion.) I was referring to the sections below in the design doc, which suggest to me that the user has to set the PROVIDED storage policy explicitly.

{code}
2. Set the StoragePolicy of hdfs://data/2016/jan/ to {DISK:2, PROVIDED:1}. This starts backing up data in hdfs://data/2016/jan/.
...
...
If the -createMountOnly flag is specified with the mount command, the MountTask is not created as the data will have to be separately backed up by the user or administrator (by setting the StoragePolicy to include PROVIDED).
{code}

I thought of passing another optional argument, {{-storagePolicy}}, to the mount command so that the user gets a chance to pass the desired policy, e.g. {{DISK:2, PROVIDED:1}} or {{SSD:1, PROVIDED:1}}. Basically, this would serve as the default storage policy, and all sub-directories would inherit it from the parent/root path. The user could still override this storage policy in the sub-directories, if needed.

{code}
hdfs mount hdfs://data/ adl://backup/ -createMountOnly [-storagePolicy <storagePolicy>] -backup
{code}

bq. Agreed. If there is interest, we can definitely do this.

Probably we could capture this in the design doc. Maybe we could do it in phase-2 or later, depending on bandwidth.

bq. The Datanodes can then "mount" the new volume which can serve blocks in the particular location

OK, that's interesting. Adding another point for discussion. First, this requires user intervention to configure the volume details and reload the data volumes, right? Secondly, are you saying that {{user mount vs volume}} is a one-to-one mapping (I mean, for each mount point the admin needs to define a unique volume)? IMHO, this could be a one-to-many mapping: the Datanodes that are aware of the respective FileSystem can access the data using the mount details (remote bucket/path, etc.).
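To make the one-to-many idea concrete, here is a minimal sketch (not HDFS code; all class and method names are hypothetical) of a registry that maps one Datanode volume to several mount points, with a configurable cap to control the {{many}} side:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch of the one-to-many mapping discussed above:
 * a single Datanode volume serving multiple mount points, with a
 * configurable limit on how many mounts a volume may back.
 */
public class MountVolumeRegistry {
    private final Map<String, List<String>> volumeToMounts = new HashMap<>();
    private final int maxMountsPerVolume;

    public MountVolumeRegistry(int maxMountsPerVolume) {
        this.maxMountsPerVolume = maxMountsPerVolume;
    }

    /** Attach a mount point (e.g. a remote bucket/path) to a volume. */
    public boolean addMount(String volume, String mountPoint) {
        List<String> mounts =
            volumeToMounts.computeIfAbsent(volume, v -> new ArrayList<>());
        if (mounts.size() >= maxMountsPerVolume) {
            return false; // reject: cap for this volume reached
        }
        mounts.add(mountPoint);
        return true;
    }

    /** Mount points currently served by the given volume. */
    public List<String> mountsFor(String volume) {
        return volumeToMounts.getOrDefault(volume, Collections.emptyList());
    }
}
```

With a cap of 2, a single volume would accept two mounts and reject a third, which is the kind of bound suggested above to keep the mapping manageable.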
For example, if we have a volume (WASB) defined at the Datanode, it can serve any write/read request to the Azure FS from different WASB mount points. I agree this will have a sequential access pattern. Also, we would need to limit the number of mount points mapped to a single volume in order to control the {{many}} side of the mapping. Welcome thoughts.

> Handling writes from HDFS to Provided storages
> ----------------------------------------------
>
>                 Key: HDFS-12090
>                 URL: https://issues.apache.org/jira/browse/HDFS-12090
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Virajith Jalaparti
>         Attachments: HDFS-12090-design.001.pdf
>
>
> HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in external storage systems accessible through HDFS. However, HDFS-9806 is limited to data being read through HDFS. This JIRA will deal with how data can be written to such {{PROVIDED}} storages from HDFS.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)