[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750715#comment-13750715 ]
Arpit Agarwal commented on HDFS-2832:
-------------------------------------

Hi Andrew,

Thanks for looking at the doc, great questions.

{quote}
- For quota management, have you considered the YARN-like abstraction of users and pools? We're moving down that avenue in HDFS-4949, and it'd be nice to eventually have a single abstraction if we can. I get that for a first cut, it's easier to stick with the existing disk quota system.
{quote}
I am interested in seeing how the pools abstraction is defined and whether it can cover all our use cases. Do you have a design doc? We chose this approach because it extends the existing quota system and APIs and, more importantly, covers our use cases.

{quote}
- How do you expect applications to handle runtime failures? If I have a stream open and my write fails due to lack of SSD quota, can I change it to retry the write to HDD? Do I get metrics so I can alert somewhere?
{quote}
"Out of quota" is a hard failure, just like hitting the disk space quota limit. The application must change the Storage Preference on the file to continue. We have not discussed metrics yet.

{quote}
- How do you handle block migration of files opened by long-lived applications like HBase that also use short-circuit local reads? Let's say HBase initially writes all its files to SSD, then we want to periodically migrate them to HDD. HBase holds onto the SSD file descriptors indefinitely, preventing reclamation of SSD capacity.
{quote}
Quota should be blocked indefinitely until the files can be moved off their current Storage Type. We did not cover this use case, so thanks for calling it out! I will update the doc.

{quote}
- If "File Storage Preferences" are part of file metadata, what happens when the files are copied, or distcp'd to another cluster?
{quote}
I think this is a tools decision. We probably want to drop the File Attributes by default, with an option to preserve them. We will document this in more detail when we get to updating the tools.
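To make the "out of quota" answer above concrete, here is a minimal sketch of how an application might react to the hard failure by changing the Storage Preference and retrying. Everything in it is hypothetical and for illustration only: the {{StorageType}} values, {{StorageQuotaExceededException}}, {{SketchFile}} and {{writeWithFallback}} are stand-ins, not APIs from the design doc or from HDFS.

```java
import java.io.IOException;

public class QuotaFallbackSketch {
    /** Hypothetical storage types, named after the SSD/HDD example in the question. */
    enum StorageType { SSD, HDD }

    /** Hypothetical exception standing in for the "out of quota" hard failure. */
    static class StorageQuotaExceededException extends IOException {
        StorageQuotaExceededException(String message) { super(message); }
    }

    /** Hypothetical in-memory stand-in for a file with a Storage Preference. */
    static class SketchFile {
        StorageType preference;
        long ssdQuotaRemaining; // bytes of SSD quota still available
        long bytesWritten;

        SketchFile(StorageType preference, long ssdQuotaRemaining) {
            this.preference = preference;
            this.ssdQuotaRemaining = ssdQuotaRemaining;
        }

        /** Fails hard when the write does not fit in the remaining SSD quota. */
        void write(long numBytes) throws StorageQuotaExceededException {
            if (preference == StorageType.SSD) {
                if (numBytes > ssdQuotaRemaining) {
                    throw new StorageQuotaExceededException("SSD quota exceeded");
                }
                ssdQuotaRemaining -= numBytes;
            }
            bytesWritten += numBytes;
        }
    }

    /**
     * Tries the write with the current preference. On a quota failure the
     * application itself switches the preference to HDD and retries once;
     * nothing falls back automatically.
     */
    static StorageType writeWithFallback(SketchFile file, long numBytes) throws IOException {
        try {
            file.write(numBytes);
        } catch (StorageQuotaExceededException e) {
            file.preference = StorageType.HDD; // explicit change by the application
            file.write(numBytes);
        }
        return file.preference;
    }

    public static void main(String[] args) throws IOException {
        SketchFile file = new SketchFile(StorageType.SSD, 100);
        System.out.println(writeWithFallback(file, 50)); // prints SSD
        System.out.println(writeWithFallback(file, 80)); // prints HDD
    }
}
```

The point of the sketch is only that the retry is the application's decision: the failed write is surfaced as an error, and the write succeeds only after the preference has been changed explicitly.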
{quote}
- Why do we want a default "Storage Preferences" specified on a directory? I'd actually prefer if we make applications explicitly request special treatment when they open a stream.
{quote}
Storage Preferences are not supported on directories. Please let me know if you see anything in the doc implying otherwise and I will fix it.

{quote}
- Let's say I'm a cluster operator, and have nodes with both PCI-e and SATA SSDs. Can I differentiate between them? How about if I add nodes with an unknown StorageType like NVRAM? Basically: what's required to add a new StorageType?
- Also related, when I bring up a new StorageType in my cluster, how do I make my applications start using it? Do I need to submit patches to HBase to now know how to use NVRAM properly? This seems like one of the downsides of physical storage types, logical means apps can do this more automatically.
{quote}
Adding a new StorageType requires a code change and an update to the StorageType enum. We made this trade-off for API and implementation simplicity in v1, but we are not ruling out adding support for logical classification in the future.

> Enable support for heterogeneous storages in HDFS
> -------------------------------------------------
>
>                 Key: HDFS-2832
>                 URL: https://issues.apache.org/jira/browse/HDFS-2832
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 0.24.0
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>         Attachments: 20130813-HeterogeneousStorage.pdf
>
>
> HDFS currently supports configuration where storages are a list of
> directories. Typically each of these directories corresponds to a volume with
> its own file system. All these directories are homogeneous and therefore
> identified as a single storage at the namenode. I propose changing the
> current model, where a Datanode *is a* storage, to one where a Datanode
> *is a collection* of storages.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira