[ https://issues.apache.org/jira/browse/HDFS-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708941#comment-14708941 ]
nijel commented on HDFS-5711:
-----------------------------

It's a long-pending issue :)
Recently I analyzed a similar requirement to improve the NN memory footprint and the HDFS cluster startup time.
The initial analysis was to keep the blockmap in a memory cache with persistence support. Also, only recent activities can be kept in memory.
With HDFS-395 in place, the NN can keep only the recent activities in memory.

Any thoughts?

> Removing memory limitation of the Namenode by persisting Block - Block location mappings to disk.
> -------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-5711
>                 URL: https://issues.apache.org/jira/browse/HDFS-5711
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Rohan Pasalkar
>            Assignee: Ajith S
>            Priority: Minor
>
> This jira is to track changes to be made to remove the HDFS name-node memory
> limitation on holding block - block location mappings.
> It is a known fact that the single Name-node architecture of HDFS has
> scalability limits. The HDFS federation project alleviates this problem by
> using horizontal scaling. This helps increase the throughput of metadata
> operations and also the amount of data that can be stored in a Hadoop cluster.
> The Name-node stores all the filesystem metadata in memory (even in the
> federated architecture). The Name-node design can be enhanced by persisting
> part of the metadata onto secondary storage and retaining the popular or
> recently accessed metadata in main memory. This design can benefit an HDFS
> deployment which doesn't use federation but needs to store a large number of
> files or a large number of blocks. Lin Xiao from Hortonworks attempted a
> similar project [1] in the summer of 2013. They used LevelDB to persist the
> Namespace information (i.e. file and directory inode information).
> A patch with this change is yet to be submitted to the code base.
> We also intend to use LevelDB to persist metadata, and plan to provide a
> complete solution by persisting not just the Namespace information but also
> the Blocks Map onto secondary storage.
> We have implemented a basic prototype which stores the block - block location
> mapping metadata in a persistent key-value store, i.e. LevelDB. The prototype
> also maintains an in-memory cache of the recently used block - block location
> mapping metadata.
> References:
> [1] Lin Xiao, Hortonworks, Removing Name-node’s memory limitation, HDFS-5389,
> http://www.slideshare.net/ydn/hadoop-meetup-hug-august-2013-removing-the-namenodes-memory-limitation



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
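
The design described in the issue (block - block location mappings persisted in a key-value store, with recently used mappings cached in memory) could look roughly like the sketch below. This is not the prototype's code: `CachedBlockMap`, `BlockStore`, and `InMemoryStore` are hypothetical names, the store is an in-memory stand-in rather than LevelDB, and least-recently-used eviction is an assumption about what "recently used" means here.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: an LRU cache of block -> locations in front of a key-value store. */
public class CachedBlockMap {

    /** Stand-in for a persistent key-value store such as LevelDB (assumed interface). */
    interface BlockStore {
        void put(long blockId, List<String> locations);
        List<String> get(long blockId);
    }

    /** In-memory placeholder so the sketch runs; a real store would write to disk. */
    static class InMemoryStore implements BlockStore {
        private final Map<Long, List<String>> data = new HashMap<>();
        int reads = 0; // counts lookups that reached the backing store

        public void put(long blockId, List<String> locations) {
            data.put(blockId, locations);
        }

        public List<String> get(long blockId) {
            reads++;
            return data.get(blockId);
        }
    }

    private final BlockStore store;
    private final LinkedHashMap<Long, List<String>> cache;

    CachedBlockMap(BlockStore store, final int capacity) {
        this.store = store;
        // accessOrder = true makes iteration order least-recently-used first,
        // so removeEldestEntry evicts the coldest mapping once capacity is exceeded.
        this.cache = new LinkedHashMap<Long, List<String>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, List<String>> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Write-through: every mapping goes to the store and also into the cache. */
    void put(long blockId, List<String> locations) {
        store.put(blockId, locations);
        cache.put(blockId, locations);
    }

    /** Serve from the cache when possible; otherwise load from the store and cache it. */
    List<String> get(long blockId) {
        List<String> hit = cache.get(blockId);
        if (hit != null) {
            return hit;
        }
        List<String> loaded = store.get(blockId);
        if (loaded != null) {
            cache.put(blockId, loaded);
        }
        return loaded;
    }

    /** Number of mappings currently held in memory. */
    int cachedCount() {
        return cache.size();
    }
}
```

With a capacity of 2, inserting three mappings leaves only the two most recent in memory, and a later lookup of the first block is served from the store and re-cached. Write-through is chosen here only for simplicity; a real Namenode would also have to deal with block reports, concurrency, and recovery, none of which this sketch covers.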