[ https://issues.apache.org/jira/browse/HBASE-17671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875393#comment-15875393 ]
Bingbing Wang commented on HBASE-17671:
---------------------------------------

Yes, I have checked the hprof, and most of the heap is taken up by writeBuffer in org.apache.thrift.transport.TFramedTransport. Most of the writeBuffer instances exceed 128M. I am very curious why such big write buffers are allocated and not recycled in time.

Yes, we close scanners on time. We can confirm this, because we rely on C++ destructors that run automatically when a scanner leaves scope to ensure every scanner is closed. We have fixed such bugs before, so there should be no scanner leak in our application.

Previously we used CMS, but hit many Java GC issues. Later we switched to G1GC and made some adjustments, and now the issue occurs less often than before.

> HBase Thrift2 OutOfMemory
> -------------------------
>
>                 Key: HBASE-17671
>                 URL: https://issues.apache.org/jira/browse/HBASE-17671
>             Project: HBase
>          Issue Type: Bug
>          Components: Thrift
>    Affects Versions: 0.98.6
>         Environment: Product
>            Reporter: Bingbing Wang
>            Priority: Critical
>         Attachments: hbase-site.xml, hbase-thrift2.log, log_gc.log.0.zip
>
>
> We have an HBase Thrift2 server deployed on Windows. The physical
> view looks like:
> QueryEngine <==> HBase Thrift2 <==> HBase cluster
> Here QueryEngine is a C++ application, and the HBase cluster has about 50
> nodes (CDH 5.3.3, namely HBase version 0.98.6).
> Our Thrift2 Java options look like:
> -server -Xms4096m -Xmx4096m -XX:MaxDirectMemorySize=8192m
> -XX:+HeapDumpOnOutOfMemoryError -XX:+UseG1GC -XX:+ParallelRefProcEnabled
> -XX:G1HeapRegionSize=4M -XX:InitiatingHeapOccupancyPercent=40
> -XX:+PrintAdaptiveSizePolicy -XX:+PrintPromotionFailure
> -Dhbase.log.dir=d:\vhayu\thrift2\log -verbose:gc -XX:+PrintGCDateStamps
> -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:PrintFLSStatistics=1
> -Xloggc:log_gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10
> -XX:GCLogFileSize=200M -Dhbase.log.file=hbase-thrift2.log
> -Dhbase.home.dir=D:\vhayu\thrift2\hbase0.98 -Dhbase.id.str=root -Dlog4j.info
> -Dhbase.root.logger=INFO,DRFA -cp
> "d:\vhayu\thrift2\hbase0.98\*;d:\vhayu\thrift2\conf"
> org.apache.hadoop.hbase.thrift2.ThriftServer -b 127.0.0.1 -f framed start
> The symptom is that after running for some time, Thrift2
> sometimes reports OOM and a heap dump (.hprof) file is generated. This
> always results in high latency from the HBase cluster.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
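
A note on the writeBuffer observation in the comment: one possible explanation, which is an assumption and not confirmed anywhere in this thread, is that each connection's write buffer behaves like a java.io.ByteArrayOutputStream that is reset after every flush instead of being reallocated. reset() only rewinds the write position and keeps the backing array, so a single large response would leave the buffer at its peak size for as long as the connection lives. The sketch below demonstrates only that JDK behaviour; the class and method names (BufferRetention, ProbeStream, capacityAfterReset) are hypothetical and are not Thrift or HBase code.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical demo class; it uses only the JDK and does not touch Thrift.
final class BufferRetention {

    // Subclass so the protected backing array length can be observed.
    static final class ProbeStream extends ByteArrayOutputStream {
        ProbeStream(int initialSize) {
            super(initialSize);
        }

        int capacity() {
            return buf.length; // size of the backing array, not the bytes written
        }
    }

    // Writes one "frame" of frameBytes bytes, resets the stream, and returns
    // the capacity that is still held afterwards.
    static int capacityAfterReset(int initialSize, int frameBytes) {
        ProbeStream out = new ProbeStream(initialSize);
        out.write(new byte[frameBytes], 0, frameBytes);
        out.reset(); // logical size becomes 0, backing array is kept
        if (out.size() != 0) {
            throw new IllegalStateException("reset() should clear the logical size");
        }
        return out.capacity();
    }

    public static void main(String[] args) {
        int frameBytes = 8 * 1024 * 1024; // assumed size of one large response
        int retained = capacityAfterReset(1024, frameBytes);
        System.out.println("frame bytes written:   " + frameBytes);
        System.out.println("capacity after reset:  " + retained);
    }
}
```

If the buffers in the heap dump follow this pattern, their size would track the largest frame ever written on each connection rather than current traffic, so it may be worth comparing the number of open client connections and the largest scan or get results against the 128M figure.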