[ 
https://issues.apache.org/jira/browse/AMBARI-18694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated AMBARI-18694:
-----------------------------------
    Description: 
As reported in HDFS-11047, DirectoryScanner performs its scan by deep-copying 
FinalizedReplica objects. In a deployment with 500,000+ blocks, we have seen 
DN heap usage climb to high peaks very quickly. These deep copies make DN heap 
usage even worse when directory scans are scheduled more frequently. 

Another factor is that a huge number of ScanInfo instances corresponding to 
HDFS blocks linger as garbage and occupy a large share of the heap until a 
full GC takes place.

This issue proposes adding JVM settings that trigger GC more frequently, so 
that the DataNode heap consumed for the two reasons above is released sooner, 
i.e. add the following options to HADOOP_DATANODE_OPTS:
{noformat}
-XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly 
-XX:ConcGCThreads=8 -XX:+UseConcMarkSweepGC
{noformat}
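
As a minimal sketch of how the options above might be appended in 
hadoop-env.sh (the GC_OPTS variable is a hypothetical helper, not part of this 
issue, and how Ambari templates this file is not shown here). The flags enable 
the CMS collector, start a concurrent cycle once the old generation is 70% 
occupied, tell the JVM to use only that threshold, and set 8 concurrent GC 
threads. They apply only to JVMs that still ship CMS.

```shell
# Sketch only: GC_OPTS is a hypothetical helper variable for illustration.
GC_OPTS="-XX:+UseConcMarkSweepGC"
GC_OPTS="${GC_OPTS} -XX:CMSInitiatingOccupancyFraction=70"
GC_OPTS="${GC_OPTS} -XX:+UseCMSInitiatingOccupancyOnly"
GC_OPTS="${GC_OPTS} -XX:ConcGCThreads=8"

# Append to any DataNode options already set instead of overwriting them.
export HADOOP_DATANODE_OPTS="${HADOOP_DATANODE_OPTS:-} ${GC_OPTS}"
```

Appending rather than assigning keeps any options that were set earlier in 
the environment script.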


> Add JVM settings of forcing garbage collections to alleviate heap consumption 
> on DataNode
> -----------------------------------------------------------------------------------------
>
>                 Key: AMBARI-18694
>                 URL: https://issues.apache.org/jira/browse/AMBARI-18694
>             Project: Ambari
>          Issue Type: Improvement
>            Reporter: Xiaobing Zhou



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
