[ 
https://issues.apache.org/jira/browse/PHOENIX-4110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138621#comment-16138621
 ] 

Samarth Jain commented on PHOENIX-4110:
---------------------------------------

Looks like this change didn't help. I ran the suite locally and monitored the 
Java heap of the forked processes. Even though we are shutting down the 
mini-cluster more often, the heap keeps growing as the tests progress. So I 
took a heap dump of one of the JVMs and ran it through a profiler. Instances 
of three classes (MetricsSystemImpl, HRegion and Configuration) occupy most 
of the heap (about 93%):

{code}
One instance of "org.apache.hadoop.metrics2.impl.MetricsSystemImpl" loaded by 
"sun.misc.Launcher$AppClassLoader @ 0x75a025670" occupies 201,306,192 (10.95%) 
bytes.

717 instances of "org.apache.hadoop.hbase.regionserver.HRegion", loaded by 
"sun.misc.Launcher$AppClassLoader @ 0x75a025670" occupy 1,218,750,256 (66.30%) 
bytes. 

2,040 instances of "org.apache.hadoop.conf.Configuration", loaded by 
"sun.misc.Launcher$AppClassLoader @ 0x75a025670" occupy 287,352,096 (15.63%) 
bytes. 
{code}

- MetricsSystemImpl is a singleton, i.e. it is supposed to be created once. It 
doesn't get shut down when the mini cluster is shut down. One option would be 
for us to shut it down ourselves when we shut down the mini cluster.

- The bulk of the heap is occupied by HRegion objects. It looks like in certain 
cases, when a region server is being stopped, not all of its regions get 
closed. On inspecting the path of strong references to HRegion, they seem to 
come from thread objects of the class JVMClusterUtil$RegionServerThread. 
Looking at the HBase code, I see that when a region server starts, it 
registers its thread with the JVM's shutdown hook mechanism. This reference 
sticks around even though the thread itself has terminated. So when the regions 
are not closed, this thread object keeps the HRegions in memory, resulting in 
a memory leak. I will file an HBase JIRA for this.
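
To make the second point concrete, here is a minimal sketch of the retention 
pattern using only the JDK. It is not HBase code: ServerThread and 
HOOK_REGISTRY are hypothetical stand-ins for the region server thread and for 
whatever static structure backs the shutdown hook registration. It only shows 
that a terminated thread, and everything it references, stays strongly 
reachable for as long as the registry still holds it.

```java
import java.util.ArrayList;
import java.util.List;

public class HookLeakSketch {
    // Hypothetical stand-in for a region server thread that owns heavy state.
    static class ServerThread extends Thread {
        // Stands in for regions that were never closed.
        final byte[] regions = new byte[1024 * 1024];

        @Override
        public void run() {
            // Serve requests, then exit.
        }
    }

    // Hypothetical stand-in for a static shutdown-hook registry.
    static final List<Thread> HOOK_REGISTRY = new ArrayList<>();

    // Registers the thread, runs it to completion, and returns it.
    static ServerThread startAndStopServer() {
        ServerThread t = new ServerThread();
        HOOK_REGISTRY.add(t);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return t;
    }

    public static void main(String[] args) {
        ServerThread t = startAndStopServer();
        // The thread is dead, but the registry still pins it and its payload.
        System.out.println("alive=" + t.isAlive()
            + ", registered=" + HOOK_REGISTRY.contains(t));
        // Deregistering is what makes the payload eligible for GC.
        HOOK_REGISTRY.remove(t);
        System.out.println("registered=" + HOOK_REGISTRY.contains(t));
    }
}
```

In this sketch the fix is the explicit remove() call; whether HBase offers an 
equivalent way to deregister is something I have not checked here.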

Note, this was for 0.98. I need to try it out with 1.3 also. Worst case, I 
think we may have to resort to halting the JVM after every test. Or maybe come 
up with a mechanism (with some help from the surefire plugin) to halt the JVM 
after every few runs. Or maybe just call System.gc() and hope for the best :)
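
As a rough sketch of what a trigger for recycling the JVM could look at, the 
snippet below reads heap usage through the standard MemoryMXBean and compares 
it against a threshold. The class name, the method names and the 0.8 threshold 
are all made up for illustration, and the part that actually halts or replaces 
the fork is deliberately left out.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapCheckSketch {
    // Fraction of the heap currently in use, between 0 and 1.
    static double usedHeapFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getMax() returns -1 when no maximum is defined; fall back to committed.
        long limit = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        return (double) heap.getUsed() / limit;
    }

    // Hypothetical policy: recycle once usage reaches the threshold.
    static boolean shouldRecycleJvm(double usedFraction, double threshold) {
        return usedFraction >= threshold;
    }

    public static void main(String[] args) {
        double used = usedHeapFraction();
        System.out.printf("heap used: %.1f%%, recycle: %b%n",
            used * 100, shouldRecycleJvm(used, 0.8));
    }
}
```

A check like this could run between test classes. It only reports usage; it 
does nothing about the leaked references themselves.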

Will keep digging.

> ParallelRunListener should monitor number of tables and not number of tests
> ---------------------------------------------------------------------------
>
>                 Key: PHOENIX-4110
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4110
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Samarth Jain
>            Assignee: Samarth Jain
>         Attachments: PHOENIX-4110.patch
>
>
> ParallelRunListener today monitors the number of tests that have been run to 
> determine when mini cluster should be shut down. This helps prevent our test 
> JVM forks from running into OOM. A better heuristic would be to instead check the 
> number of tables that were created by tests. This way when a particular test 
> class has created lots of tables, we can shut down the mini cluster sooner.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
