[ 
https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335747#comment-14335747
 ] 

Charles Lamb commented on HDFS-7836:
------------------------------------

Problem Statement

The number of blocks stored by the largest HDFS clusters continues to increase. 
This growth adds pressure to the BlockManager, the part of the NameNode that 
handles block data from across the cluster.

Full block reports are problematic.  The more blocks each DataNode has, the 
longer it takes to process a full block report from that DataNode.  Storage 
densities have roughly doubled each year for the past few years.  Meanwhile, 
increases in CPU power have come mostly in the form of additional cores rather 
than faster clock speeds.  Currently, the NameNode cannot use these additional 
cores because full block reports are processed while holding the namesystem 
lock.
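
The serialization described above can be illustrated with a toy Java sketch. 
This is not HDFS code: the class GlobalLockSketch, its methods, and its 
counters are hypothetical, and the single ReentrantReadWriteLock only stands in 
for the namesystem lock. The sketch submits several simulated full block 
reports to a thread pool and records how many were ever being processed at the 
same time.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class GlobalLockSketch {
    // Hypothetical stand-in for the namesystem lock (assumption, not the HDFS class).
    private final ReentrantReadWriteLock namesystemLock = new ReentrantReadWriteLock();
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger maxInFlight = new AtomicInteger();
    private long blocksProcessed = 0; // only touched while holding the write lock

    // Processes one simulated full block report under the exclusive lock.
    void processFullReport(long[] blockIds) {
        namesystemLock.writeLock().lock();
        try {
            int now = inFlight.incrementAndGet();
            maxInFlight.accumulateAndGet(now, Math::max);
            for (int i = 0; i < blockIds.length; i++) {
                blocksProcessed++; // placeholder for per-block reconciliation work
            }
            inFlight.decrementAndGet();
        } finally {
            namesystemLock.writeLock().unlock();
        }
    }

    // Returns the largest number of reports that were processed simultaneously.
    static int maxConcurrentReports(int dataNodes, int blocksPerNode, int threads) {
        GlobalLockSketch nn = new GlobalLockSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < dataNodes; i++) {
            long[] report = new long[blocksPerNode];
            pool.submit(() -> nn.processFullReport(report));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for reports", e);
        }
        return nn.maxInFlight.get();
    }

    public static void main(String[] args) {
        // Four worker threads are available, but the exclusive lock admits one at a time.
        System.out.println("max concurrent reports: " + maxConcurrentReports(8, 100_000, 4));
    }
}
```

However many worker threads are configured, the exclusive lock lets only one 
report through at a time, so the extra cores sit idle and the total time grows 
with the number of blocks per report.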

The BlockManager stores all blocks in memory and this contributes to a large 
heap size.  As the NameNode Java heap size has grown, full garbage collection 
events have started to take several minutes.  Although it is often possible to 
avoid full GCs by re-using Java objects, they remain an operational concern for 
administrators.  They also contribute to a long NameNode startup time, 
sometimes measured in tens of minutes for the biggest clusters.


Goals

We need to improve the BlockManager to handle the challenges of the next few 
years.  Our specific goals for this project are to:

* Reduce lock contention for the FSNamesystem lock
* Enable concurrent processing of block reports
* Reduce the Java heap size of the NameNode
* Optimize the use of network resources

[~cmccabe] and I will be working on this Jira. We propose doing this work on a 
separate branch. If there is interest in a community meeting to discuss these 
changes, then perhaps Tuesday 3/10/15 at Cloudera in Palo Alto, CA would work? 
I suggest that date because I will be in the Bay Area that day and would like 
to meet with other interested community members in person. I'll also be around 
3/11 and 3/12 if we need an alternate date.


> BlockManager Scalability Improvements
> -------------------------------------
>
>                 Key: HDFS-7836
>                 URL: https://issues.apache.org/jira/browse/HDFS-7836
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Charles Lamb
>            Assignee: Charles Lamb
>
> Improvements to BlockManager scalability.



