[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13759132#comment-13759132 ]
Eric Sirianni commented on HDFS-2832:
-------------------------------------

Arpit and team - nice document. A few questions and comments:

# Does the design allow a _single_ datanode to hold multiple replicas of a block (presumably each on a different storage type, e.g. SSD and HDD)? If so _(and I think it should)_, this would seem to require some refactoring of {{FSDatasetInterface}}, which is oriented around the assumption that a block maps to a single volume (e.g. {{FSVolume getVolume(Block b)}}).
# Does the design intend to allow storage types (e.g. RAM) to be backed by non-file-addressable stores? If so _(and I think it should)_, this would also require redesigning some areas:
#* The {{FSDatasetInterface}} abstraction, which allows for pluggable (i.e. non-file-addressable) block storage mechanisms, is global to the DataNode. Perhaps it should be pluggable on a _per-storage_ basis, e.g. a {{MemoryFSDataset}} and a {{FileFSDataset}} implementation co-existing within a single DataNode instance. Thinking about this some more, this might also help address my point above about {{FSDatasetInterface}} being oriented around a single {{FSVolume}} per block.
#* Areas where file-addressable block access is assumed outside the {{FSDatasetInterface}} abstraction:
#** {{BlockSender}} / {{BlockReceiver}}, which downcast to {{FileInputStream}} to obtain the underlying file descriptor
#** {{DataStorage.linkBlocks}}, which uses hard links for upgrade/revert scenarios

> Enable support for heterogeneous storages in HDFS
> -------------------------------------------------
>
>                 Key: HDFS-2832
>                 URL: https://issues.apache.org/jira/browse/HDFS-2832
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 0.24.0
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>         Attachments: 20130813-HeterogeneousStorage.pdf
>
>
> HDFS currently supports a configuration where storages are a list of
> directories. Typically each of these directories corresponds to a volume with
> its own file system. All these directories are homogeneous and are therefore
> identified as a single storage at the namenode. I propose changing the
> current model, where a Datanode *is a* storage, to one where a Datanode
> *is a collection* of storages.
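
To make the per-storage pluggability raised in the comment above more concrete, here is a minimal Java sketch. It is not HDFS code: {{StorageDataset}}, {{MemoryDataset}}, {{FileDataset}}, {{DataNodeStorages}} and the {{StorageType}} enum are hypothetical names invented for illustration, and the real {{FSDatasetInterface}} is considerably larger. The sketch assumes that callers only ever see an {{InputStream}}, never a file, and that one datanode may hold a replica of the same block on more than one storage.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; none of these types exist in HDFS.
final class PerStorageDatasetSketch {

    enum StorageType { RAM, SSD, HDD }

    /** One pluggable block store per storage. Exposes streams, never files. */
    interface StorageDataset {
        StorageType type();
        void put(long blockId, byte[] data);
        boolean contains(long blockId);
        InputStream open(long blockId);
    }

    /** Non-file-addressable store: replicas live on the heap. */
    static final class MemoryDataset implements StorageDataset {
        private final Map<Long, byte[]> blocks = new HashMap<>();
        public StorageType type() { return StorageType.RAM; }
        public void put(long blockId, byte[] data) { blocks.put(blockId, data.clone()); }
        public boolean contains(long blockId) { return blocks.containsKey(blockId); }
        public InputStream open(long blockId) {
            return new ByteArrayInputStream(blocks.get(blockId));
        }
    }

    /** File-addressable store: one file per block under a directory. */
    static final class FileDataset implements StorageDataset {
        private final StorageType type;
        private final Path dir;
        FileDataset(StorageType type, Path dir) { this.type = type; this.dir = dir; }
        /** Convenience factory for the demo; uses a fresh temporary directory. */
        static FileDataset inTempDir(StorageType type) {
            try {
                return new FileDataset(type, Files.createTempDirectory("blocks"));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        private Path file(long blockId) { return dir.resolve("blk_" + blockId); }
        public StorageType type() { return type; }
        public void put(long blockId, byte[] data) {
            try {
                Files.write(file(blockId), data);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        public boolean contains(long blockId) { return Files.exists(file(blockId)); }
        public InputStream open(long blockId) {
            try {
                return Files.newInputStream(file(blockId));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Datanode-level view: a block may have one replica per storage. */
    static final class DataNodeStorages {
        private final List<StorageDataset> storages = new ArrayList<>();
        void add(StorageDataset s) { storages.add(s); }
        /** Returns the type of every storage that holds a replica of the block. */
        List<StorageType> locate(long blockId) {
            List<StorageType> found = new ArrayList<>();
            for (StorageDataset s : storages) {
                if (s.contains(blockId)) {
                    found.add(s.type());
                }
            }
            return found;
        }
    }

    public static void main(String[] args) {
        DataNodeStorages dn = new DataNodeStorages();
        StorageDataset ram = new MemoryDataset();
        StorageDataset hdd = FileDataset.inTempDir(StorageType.HDD);
        dn.add(ram);
        dn.add(hdd);
        byte[] payload = {1, 2, 3};
        ram.put(42L, payload);   // same block, two replicas on one datanode
        hdd.put(42L, payload);
        System.out.println(dn.locate(42L)); // prints [RAM, HDD]
    }
}
```

Because {{locate}} returns a list rather than a single volume, the "one block, one volume" assumption disappears at the datanode level, and the memory-backed store never has to pretend to be a file. Anything that needs a file descriptor (as {{BlockSender}} / {{BlockReceiver}} do today) would have to become an optional capability of the file-backed implementation in such a design.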