[ 
https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748975#comment-13748975
 ] 

Andrew Wang commented on HDFS-2832:
-----------------------------------

I finally got a chance to read this doc, nice work. A few questions:

- For quota management, have you considered the YARN-like abstraction of users 
and pools? We're moving down that avenue in HDFS-4949, and it'd be nice to 
eventually have a single abstraction if we can. I get that for a first cut, 
it's easier to stick with the existing disk quota system.
- How do you expect applications to handle runtime failures? If I have a stream 
open and my write fails due to lack of SSD quota, can I change it to retry the 
write to HDD? Do I get metrics so I can alert somewhere?
- How do you handle block migration of files opened by long-lived applications 
like HBase that also use short-circuit local reads? Let's say HBase initially 
writes all its files to SSD, then we want to periodically migrate them to HDD. 
HBase holds onto the SSD file descriptors indefinitely, preventing reclamation 
of SSD capacity.
- If "File Storage Preferences" are part of file metadata, what happens when 
the files are copied, or distcp'd to another cluster?
- Why do we want a default "Storage Preferences" specified on a directory? I'd 
actually prefer if we make applications explicitly request special treatment 
when they open a stream.
- Let's say I'm a cluster operator, and have nodes with both PCI-e and SATA 
SSDs. Can I differentiate between them? How about if I add nodes with an 
unknown StorageType like NVRAM? Basically: what's required to add a new 
StorageType?
- Also related, when I bring up a new StorageType in my cluster, how do I make 
my applications start using it? Do I need to submit patches to HBase to now 
know how to use NVRAM properly? This seems like one of the downsides of 
physical storage types; logical types would let apps do this more automatically.
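
To make the runtime-failure question above concrete, here is a minimal sketch of the behavior being asked about: a write that falls back from SSD to HDD when SSD quota runs out, and bumps a counter an operator could alert on. Everything in it is an assumption for illustration (StorageType, QuotaExceededError, FakeCluster, write_with_fallback, and the metrics dict are hypothetical names, not HDFS APIs), and the design doc does not say whether such a retry would be possible on an already-open stream.

```python
from enum import Enum


class StorageType(Enum):
    # Hypothetical storage types; a real cluster may define others.
    SSD = "SSD"
    HDD = "HDD"


class QuotaExceededError(Exception):
    """Hypothetical error raised when a storage type's quota is exhausted."""

    def __init__(self, storage_type):
        super().__init__(f"quota exceeded on {storage_type.value}")
        self.storage_type = storage_type


class FakeCluster:
    """In-memory stand-in for a cluster with per-storage-type quotas in bytes."""

    def __init__(self, quotas):
        self.remaining = dict(quotas)
        self.stored = []

    def write(self, data, storage_type):
        # Reject the write if it does not fit in the remaining quota.
        if len(data) > self.remaining[storage_type]:
            raise QuotaExceededError(storage_type)
        self.remaining[storage_type] -= len(data)
        self.stored.append((storage_type, data))


def write_with_fallback(cluster, data, preferences, metrics):
    """Try each storage type in order and return the one that accepted the write.

    Each quota failure increments metrics[storage_type] so it can be alerted on.
    """
    if not preferences:
        raise ValueError("at least one storage type is required")
    for storage_type in preferences:
        try:
            cluster.write(data, storage_type)
            return storage_type
        except QuotaExceededError:
            metrics[storage_type] = metrics.get(storage_type, 0) + 1
    # Every preferred storage type was out of quota.
    raise QuotaExceededError(preferences[-1])


cluster = FakeCluster({StorageType.SSD: 4, StorageType.HDD: 100})
metrics = {}
used = write_with_fallback(
    cluster, b"12345678", [StorageType.SSD, StorageType.HDD], metrics
)
```

In this sketch the fallback order is chosen by the application on each write. Whether that decision belongs in the application or in HDFS itself is exactly what the question is asking.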
                
> Enable support for heterogeneous storages in HDFS
> -------------------------------------------------
>
>                 Key: HDFS-2832
>                 URL: https://issues.apache.org/jira/browse/HDFS-2832
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 0.24.0
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>         Attachments: 20130813-HeterogeneousStorage.pdf
>
>
> HDFS currently supports configuration where storages are a list of 
> directories. Typically each of these directories corresponds to a volume with 
> its own file system. All these directories are homogeneous and therefore 
> identified as a single storage at the namenode. I propose changing the 
> current model, where a Datanode *is a* storage, to one where a Datanode *is a 
> collection* of storages.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
