[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751056#comment-13751056 ]
Arpit Agarwal commented on HDFS-2832:
-------------------------------------

Andrew,

{quote}
Grep for this line: "Storage preferences could be specified per-file or per-directory."
{quote}
I meant to say we considered both options but chose to go with per-file preferences. I should have worded it better. :-)

{quote}
Is this a stream API? The doc only mentions specifying storage preferences at create time via DFSClient#create.
{quote}
This is briefly mentioned in section 6.4 - _File Attribute APIs to query, set and clear file attributes which will be used to modify Storage Preferences_. We have to describe File Attributes in detail before we start working on the feature API.

{quote}
I'm wondering how this works for the case where I write to SSD, run out of capacity, then want to write the rest of my file to HDD. Do I need to close the file, modify the Storage Preference, then reopen for append? This also potentially requires migrating the last block to HDD, since storage types are tracked per-block, and then you might hit the "HBase keeps fds open forever" issue.
{quote}
We differentiate between running out of capacity and running out of quota (7.1, 1b and 1c). HDFS handles _Out of Capacity_ transparently by allocating subsequent blocks on HDD as a fallback, and does not require any migration to make forward progress.

> Enable support for heterogeneous storages in HDFS
> -------------------------------------------------
>
>                 Key: HDFS-2832
>                 URL: https://issues.apache.org/jira/browse/HDFS-2832
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 0.24.0
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>         Attachments: 20130813-HeterogeneousStorage.pdf
>
>
> HDFS currently supports configuration where storages are a list of
> directories. Typically each of these directories corresponds to a volume with
> its own file system. All these directories are homogeneous and therefore
> identified as a single storage at the namenode. I propose changing the
> current model, where a Datanode *is a* storage, to one where a Datanode
> *is a collection* of storages.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
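
The _Out of Capacity_ fallback described in the comment above could look roughly like the following. This is a minimal sketch, not HDFS code: the class, enum, and method names are hypothetical, and capacity is simplified to a count of free blocks per storage type. It shows only that a later block can land on HDD once SSD is full, while blocks already written stay where they are. Quota handling (7.1, 1b and 1c) is not modeled, since the comment does not say how it behaves.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch: none of these names are real HDFS APIs.
final class FallbackPlacementSketch {

    enum StorageType { SSD, HDD }

    // Remaining capacity per storage type, counted in whole blocks (an assumption).
    private final Map<StorageType, Integer> freeBlocks = new EnumMap<>(StorageType.class);

    FallbackPlacementSketch(int ssdBlocks, int hddBlocks) {
        freeBlocks.put(StorageType.SSD, ssdBlocks);
        freeBlocks.put(StorageType.HDD, hddBlocks);
    }

    // Chooses a storage type for the next block of a file.
    // The preferred type is tried first; if it has no capacity left,
    // the block is placed on HDD instead. Earlier blocks are never moved.
    StorageType allocateNextBlock(StorageType preferred) {
        if (tryReserve(preferred)) {
            return preferred;
        }
        if (preferred != StorageType.HDD && tryReserve(StorageType.HDD)) {
            return StorageType.HDD;
        }
        throw new IllegalStateException("no capacity left on any storage type");
    }

    // Reserves one block on the given type, or returns false if it is full.
    private boolean tryReserve(StorageType type) {
        int free = freeBlocks.get(type);
        if (free == 0) {
            return false;
        }
        freeBlocks.put(type, free - 1);
        return true;
    }

    public static void main(String[] args) {
        // Two SSD blocks and three HDD blocks available; the file prefers SSD.
        FallbackPlacementSketch cluster = new FallbackPlacementSketch(2, 3);
        for (int i = 0; i < 4; i++) {
            System.out.println("block " + i + " -> "
                    + cluster.allocateNextBlock(StorageType.SSD));
        }
    }
}
```

With two SSD blocks free, the first two blocks of the file go to SSD and the next two go to HDD. The writer does not have to close the file or change its preference to keep making progress.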