[ https://issues.apache.org/jira/browse/HDFS-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406304#comment-13406304 ]
Laxman commented on HDFS-3600:
------------------------------

bq. -XX:+DisableExplicitGC

This could be the culprit.

Generally I have noticed this kind of problem in DataNode and RegionServer processes. These processes use native memory heavily via NIO, and I have seen a RegionServer (HBase) process consume 20+ GB of memory although its max heap (-Xmx) was configured to 4 GB.

So, to keep the memory footprint (VIRT and RES values) under control, we need to configure MaxDirectMemorySize.

At the same time, I observed that this direct memory is not part of the heap and is collected only by a full GC (when usage reaches the limit, or at the RMI server DGC interval).

To conclude: configure MaxDirectMemorySize, but DON'T use DisableExplicitGC.

@Brahma, can you please post your findings after removing this flag (DisableExplicitGC)?

> DataNode is not responding After throwing java.lang.OutOfMemoryError: Direct
> buffer memory
> ------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3600
>                 URL: https://issues.apache.org/jira/browse/HDFS-3600
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 2.0.1-alpha
>            Reporter: Brahma Reddy Battula
>
> Scenario:
> =========
> Started an NN with four DNs.
> Wrote a client program that keeps writing, appending and reading data
> with 10 threads.
> After 4 hours, got an OOME. The DN was then listed as dead: it is not sending any
> heartbeats, but GC is happening.
> *GC OPTS configured for DN*
> -Xms3G -Xmx4G -XX:NewSize=256M -XX:MaxNewSize=512M -XX:PermSize=128M
> -XX:MaxPermSize=128M -XX:CMSFullGCsBeforeCompaction=1
> -XX:MaxDirectMemorySize=1G -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection
> -XX:CMSInitiatingOccupancyFraction=65
> -Xloggc:/home/install/hadoop/datanode/logs/datanode-root-gc.log
> -XX:+PrintGCDetails -XX:+DisableExplicitGC
>
> *CPU usage for DN*
> {noformat}
> Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
> Cpu(s):  0.1%us,  0.0%sy,  0.0%ni, 99.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:     15955M total,    15806M used,      148M free,      436M buffers
> Swap:    12284M total,        9M used,    12274M free,    11422M cached
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  7431 root      20   0 6291m 2.6g  13m S    0 16.4  55:20.65 java
> {noformat}
> *JAVA Version*
> {noformat}
> sun.boot.library.path = /root/nodesetup/java/jdk1.6.0_31/jre/lib/amd64
> java version "1.6.0_31"
> Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
> Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
> {noformat}
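
To make the comment's point about direct memory concrete, here is a minimal sketch that shows NIO direct buffers being accounted for outside the Java heap. It is not taken from the DataNode code. The class name `DirectMemoryProbe` and its methods are hypothetical, and it assumes Java 7 or later, because `BufferPoolMXBean` does not exist on the 1.6.0_31 JVM in the report. As far as I know, HotSpot's direct buffer allocation path calls `System.gc()` when the MaxDirectMemorySize limit is reached so that unreachable direct buffers can be released, and -XX:+DisableExplicitGC turns that call into a no-op, which would fit the OOME seen here.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not part of Hadoop or HBase.
public class DirectMemoryProbe {

    /** Bytes currently held by the JVM's "direct" buffer pool, or -1 if no such pool is exposed. */
    public static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            // HotSpot names the pool that backs ByteBuffer.allocateDirect "direct".
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    /** Allocates one direct buffer and returns how much the direct pool grew. */
    public static long growthAfterAllocating(int bytes) {
        long before = directMemoryUsed();
        ByteBuffer buffer = ByteBuffer.allocateDirect(bytes);
        long after = directMemoryUsed();
        // Touch the buffer so it stays reachable until after the second reading.
        buffer.put(0, (byte) 1);
        return after - before;
    }

    public static void main(String[] args) {
        // The heap limit (-Xmx) is reported separately from the direct pool.
        System.out.println("max heap (bytes): " + Runtime.getRuntime().maxMemory());
        System.out.println("direct pool growth (bytes): " + growthAfterAllocating(16 * 1024 * 1024));
        System.out.println("direct pool total (bytes): " + directMemoryUsed());
    }
}
```

Running this with a small -Xmx and a larger -XX:MaxDirectMemorySize should show the direct pool growing independently of the heap limit. Polling the same MXBean on a live process is one way to check whether direct memory is actually being released after the flag is removed.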