[ https://issues.apache.org/jira/browse/HDFS-12090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092534#comment-16092534 ]
Rakesh R commented on HDFS-12090:
---------------------------------

bq. That's what the -createMountOnly flag is expected to do. Are you referring to something else?

(Sorry about the confusion.) I was referring to the sections below in the design doc, which suggest to me that the user has to set the PROVIDED storage policy explicitly.

{code}
2. Set the StoragePolicy of hdfs://data/2016/jan/ to {DISK:2, PROVIDED:1}. This starts backing up data in hdfs://data/2016/jan/.
...
...
If the -createMountOnly flag is specified with the mount command, the MountTask is not created as the data will have to be separately backed up by the user or administrator (by setting the StoragePolicy to include PROVIDED).
{code}

I thought of passing another optional argument, {{-storagePolicy}}, to the mount command so that the user gets a chance to pass the desired policy, e.g. {{DISK:2, PROVIDED:1}} or {{SSD:1, PROVIDED:1}}. Basically, this would serve as the default storage policy, and all sub-directories would inherit it from the parent/root path. The user could still override this storage policy in the sub-directories, if needed.

{code}
hdfs mount hdfs://data/ adl://backup/ -createMountOnly [-storagePolicy <storagePolicy>] -backup
{code}

bq. Agreed. If there is interest, we can definitely do this.

Probably we could capture this in the design doc. Maybe we could do it in phase-2 or later, depending on bandwidth.

bq. The Datanodes can then "mount" the new volume which can serve blocks in the particular location

OK, that's interesting. Adding another point for discussion. First, this requires user intervention to configure the volume details and reload the data volumes, right? Secondly, are you saying that {{user mount vs volume}} is a one-to-one mapping (I mean, for each mount point the admin needs to define a unique volume)? IMHO, this could be a one-to-many mapping: the Datanodes that are aware of the respective FileSystem can access the data using the mount details (remote bucket/path, etc.).
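To make the one-to-many idea concrete, here is a minimal sketch (not HDFS code; all class and method names are hypothetical) of a registry that maps one Datanode volume to several mount points, with a configurable cap to control the {{many}} side:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch of the one-to-many mapping discussed above:
 * a single Datanode volume serving multiple mount points, with a
 * configurable limit on how many mounts a volume may back.
 */
public class MountVolumeRegistry {
    private final Map<String, List<String>> volumeToMounts = new HashMap<>();
    private final int maxMountsPerVolume;

    public MountVolumeRegistry(int maxMountsPerVolume) {
        this.maxMountsPerVolume = maxMountsPerVolume;
    }

    /** Attach a mount point (e.g. a remote bucket/path) to a volume. */
    public boolean addMount(String volume, String mountPoint) {
        List<String> mounts =
            volumeToMounts.computeIfAbsent(volume, v -> new ArrayList<>());
        if (mounts.size() >= maxMountsPerVolume) {
            return false; // reject: cap for this volume reached
        }
        mounts.add(mountPoint);
        return true;
    }

    /** Mount points currently served by the given volume. */
    public List<String> mountsFor(String volume) {
        return volumeToMounts.getOrDefault(volume, Collections.emptyList());
    }
}
```

With a cap of 2, a single volume would accept two mounts and reject a third, which is the kind of bound suggested above to keep the mapping manageable.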
For example, if we have a volume (WASB) defined at the Datanode, it can serve any write/read request to the Azure FS from different WASB mount points. I agree this will have a sequential access pattern. Also, we would need to limit the number of mount points mapped to a single volume in order to control the {{many}} side of the mapping. Welcome thoughts.

> Handling writes from HDFS to Provided storages
> ----------------------------------------------
>
>                 Key: HDFS-12090
>                 URL: https://issues.apache.org/jira/browse/HDFS-12090
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Virajith Jalaparti
>         Attachments: HDFS-12090-design.001.pdf
>
>
> HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in external storage systems accessible through HDFS. However, HDFS-9806 is limited to data being read through HDFS. This JIRA will deal with how data can be written to such {{PROVIDED}} storages from HDFS.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)