[ https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335747#comment-14335747 ]
Charles Lamb commented on HDFS-7836:
------------------------------------

Problem Statement

The number of blocks stored by the largest HDFS clusters continues to increase. This increase adds pressure to the BlockManager, that part of the NameNode which handles block data from across the cluster.

Full block reports are problematic. The more blocks each DataNode has, the longer it takes to process a full block report from that DataNode. Storage densities have roughly doubled each year for the past few years. Meanwhile, increases in CPU power have come mostly in the form of additional cores rather than faster clock speeds. Currently, the NameNode cannot use these additional cores because full block reports are processed while holding the namesystem lock.

The BlockManager stores all blocks in memory and this contributes to a large heap size. As the NameNode Java heap size has grown, full garbage collection events have started to take several minutes. Although it is often possible to avoid full GCs by re-using Java objects, they remain an operational concern for administrators. They also contribute to a long NameNode startup time, sometimes measured in tens of minutes for the biggest clusters.

Goals

We need to improve the BlockManager to handle the challenges of the next few years. Our specific goals for this project are to:

* Reduce lock contention for the FSNamesystem lock
* Enable concurrent processing of block reports
* Reduce the Java heap size of the NameNode
* Optimize the use of network resources

[~cmccabe] and I will be working on this Jira. We propose doing this work on a separate branch. If there is interest in a community meeting to discuss these changes, then perhaps Tuesday 3/10/15 at Cloudera in Palo Alto, CA would work? I suggest that date because I will be in the bay area that day and would like to meet with other interested community members in person. I'll also be around 3/11 and 3/12 if we need an alternate date.

> BlockManager Scalability Improvements
> -------------------------------------
>
>                 Key: HDFS-7836
>                 URL: https://issues.apache.org/jira/browse/HDFS-7836
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Charles Lamb
>            Assignee: Charles Lamb
>
> Improvements to BlockManager scalability.