Hi, for your first question: if you deploy HA with QJM (the Quorum Journal Manager), the NameNodes don't need to share highly reliable, sophisticated storage.
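To put rough numbers on that: with QJM the edit log goes to a set of JournalNodes on ordinary machines, and an edit counts as written once a majority of them acknowledge it. Here is a small sketch of that arithmetic. The function names are mine, for illustration only; they are not part of any Hadoop API.

```python
def qjm_write_quorum(num_journal_nodes: int) -> int:
    """JournalNodes that must acknowledge an edit: a strict majority of N."""
    if num_journal_nodes < 1:
        raise ValueError("need at least one JournalNode")
    return num_journal_nodes // 2 + 1


def qjm_tolerated_failures(num_journal_nodes: int) -> int:
    """JournalNode failures the quorum can survive: (N - 1) / 2, rounded down."""
    if num_journal_nodes < 1:
        raise ValueError("need at least one JournalNode")
    return (num_journal_nodes - 1) // 2


if __name__ == "__main__":
    # Hypothetical cluster sizes, chosen only to show the pattern.
    for n in (3, 4, 5):
        print(f"N={n}: quorum={qjm_write_quorum(n)}, "
              f"tolerated failures={qjm_tolerated_failures(n)}")
```

So three JournalNodes survive one failure, and going from three to four adds nothing, which is why an odd count is the usual choice.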
--Sent from my Sony mobile.

On Jun 30, 2013 11:38 AM, "Yonghwan Kim (JIRA)" <j...@apache.org> wrote:

> Yonghwan Kim created HDFS-4945:
> ----------------------------------
>
>              Summary: A Distributed and Cooperative NameNode Cluster for a Highly-Available HDFS
>                  Key: HDFS-4945
>                  URL: https://issues.apache.org/jira/browse/HDFS-4945
>              Project: Hadoop HDFS
>           Issue Type: New Feature
>           Components: auto-failover
>     Affects Versions: HA branch (HDFS-1623)
>             Reporter: Yonghwan Kim
>
>
> Recently, Hadoop attracts much attention of engineers and researchers as
> an emerging and effective framework for Big Data.
> HDFS(Hadoop Distributed File System) can manage huge amount of data with
> guaranteeing high performance and reliability with only commodity hardware.
>
> However, HDFS requires a single master node, called NameNode, to manage
> the entire namespace (or all the i-nodes) of a file system. This causes
> SPOF (Single Point Of Failure) problem because the file system becomes
> inaccessible when the NameNode fails. (HDFS-2064)
>
> This also causes a bottleneck of efficiency since all the access requests
> to the file system have to contact the NameNode. Hadoop 2.0 resolves the
> SPOF problem by introducing manual failover based on two NameNodes, Active
> and Standby. However, it still has the efficiency bottleneck problem since
> all the access requests have to contact the Active in ordinary executions.
> It may also lose an advantage of using commodity hardware since the two
> NameNodes have to share a highly-reliable sophisticated storage.
>
> We here propose a new HDFS architecture to resolve all the problems
> mentioned above.
> The proposed architecture has the following features and advantages.
>
> 1. Multiple NameNodes (not restricted to two) can be utilized to improve
> availability. The entire namespace of a file system is partitioned into
> several fragments, and replicas of each fragment are dispersed among the
> NameNodes. When each fragment has k replicas, the file system can tolerate
> up to floor(k/2 - 1) faulty NameNodes.
>
> 2. Multiple NameNodes can be utilized to improve performance. The
> performance bottleneck caused by a single NameNode can be circumvented by
> assigning different NameNodes to different fragments as the primary ones
> (or the entry points).
>
> 3. The highly-reliable storage shared by the NameNodes is removed by
> introducing message-based consistency mechanism among the NameNodes.
> The architecture requires only commodity hardware.
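
On the bound in item 1 of the proposal: a quick sketch that evaluates floor(k/2 - 1) exactly as it is written above, for a few replica counts. The function name and the clamp at zero are my own assumptions for illustration; the proposal does not say how small k should be handled.

```python
import math


def proposal_tolerated_failures(k: int) -> int:
    """Evaluate floor(k/2 - 1) as written in the proposal.

    Clamping at 0 is an assumption made here so that tiny k does not
    produce a negative count; the proposal itself does not specify this.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return max(0, math.floor(k / 2 - 1))


if __name__ == "__main__":
    # Hypothetical replica counts, chosen only to show how the bound grows.
    for k in (2, 3, 4, 5, 6):
        print(f"k={k}: tolerated faulty NameNodes={proposal_tolerated_failures(k)}")
```

Taken literally, the formula gives 0 for k = 3 and 1 for k = 4 or 5, so it may be worth confirming with the reporter whether that is the intended bound.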