[ 
https://issues.apache.org/jira/browse/HBASE-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162475#comment-13162475
 ] 

Shrijeet Paliwal commented on HBASE-4633:
-----------------------------------------

Recent updates: 
* In my case the leak/memory hold is not in the HBase client; I could not find 
enough evidence to conclude that it is. What I did find is that our application 
holds one heavy object in memory, shared between threads. Every N minutes 
the application creates a new instance of this class. Unless some thread is 
still holding on to an old instance, old instances are GCed in time. Hence, 
in theory, there should be only one active instance of the heavy object at any 
time. 

* Under heavy load, with the client operation RPC timeout enabled, some threads 
get stuck. Each stuck thread keeps its old instance reachable, so multiple 
instances of the heavy object stay alive and the heap grows. 
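
The retention pattern described in the two points above can be sketched in a 
few lines of Java. This is only an illustrative model, not our application 
code: HeavyObject, Holder, Worker and reachableInstances are hypothetical 
names, and a "stuck" thread is modelled simply as a worker that never calls 
finish(). 

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

final class HeavyObjectRetention {

    /** Stand-in for the large shared object (hypothetical). */
    static final class HeavyObject {
        final int generation;
        HeavyObject(int generation) { this.generation = generation; }
    }

    /** Publishes the current instance; refresh() models the rebuild every N minutes. */
    static final class Holder {
        private final AtomicReference<HeavyObject> current =
                new AtomicReference<HeavyObject>(new HeavyObject(0));
        HeavyObject get() { return current.get(); }
        void refresh() { current.set(new HeavyObject(current.get().generation + 1)); }
    }

    /** A worker captures the instance when a request starts and drops it when done. */
    static final class Worker {
        private HeavyObject inUse;
        void begin(Holder holder) { inUse = holder.get(); }
        void finish() { inUse = null; }
        HeavyObject inUse() { return inUse; }
    }

    /** Counts distinct instances still strongly referenced by the holder or a worker. */
    static int reachableInstances(Holder holder, List<Worker> workers) {
        Set<HeavyObject> live =
                Collections.newSetFromMap(new IdentityHashMap<HeavyObject, Boolean>());
        live.add(holder.get());
        for (Worker w : workers) {
            if (w.inUse() != null) live.add(w.inUse());
        }
        return live.size();
    }

    public static void main(String[] args) {
        Holder holder = new Holder();
        List<Worker> workers = new ArrayList<Worker>();
        for (int i = 0; i < 3; i++) workers.add(new Worker());

        workers.get(0).begin(holder);   // stuck: never finishes, pins generation 0
        holder.refresh();
        workers.get(1).begin(holder);   // stuck: pins generation 1
        holder.refresh();
        workers.get(2).begin(holder);   // healthy request on generation 2
        workers.get(2).finish();

        System.out.println("reachable instances: " + reachableInstances(holder, workers));
    }
}
```

As long as every worker calls finish(), only the holder's current instance 
stays reachable. Every worker that never returns pins one more generation, 
which is the growth pattern seen in the heap. 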

After reading the client code multiple times, I cannot see why an application 
thread would get stuck for several minutes. We have safeguards to clean up 
calls 'forcefully' if they have been alive for longer than the RPC timeout 
interval. 
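
For reference, the kind of safeguard meant here can be sketched as a periodic 
sweep over in-flight calls. This is a simplified model and not the actual 
HBase client code: CallTimeoutSweeper, PendingCall, register and sweep are 
hypothetical names, and the clock is passed in explicitly to keep the sketch 
deterministic. 

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class CallTimeoutSweeper {

    /** One in-flight call (hypothetical model, not an HBase class). */
    static final class PendingCall {
        final long startMillis;
        private boolean done;
        private String error;

        PendingCall(long startMillis) { this.startMillis = startMillis; }

        /** Marks the call as failed and wakes any thread waiting on it. */
        synchronized void fail(String reason) {
            error = reason;
            done = true;
            notifyAll();
        }

        synchronized boolean isDone() { return done; }
        synchronized String error() { return error; }
    }

    private final Map<Integer, PendingCall> calls =
            new ConcurrentHashMap<Integer, PendingCall>();
    private final long rpcTimeoutMillis;

    CallTimeoutSweeper(long rpcTimeoutMillis) { this.rpcTimeoutMillis = rpcTimeoutMillis; }

    /** Records a call that started at nowMillis. */
    PendingCall register(int id, long nowMillis) {
        PendingCall call = new PendingCall(nowMillis);
        calls.put(id, call);
        return call;
    }

    /** Fails and removes every call older than the RPC timeout; returns how many expired. */
    int sweep(long nowMillis) {
        int expired = 0;
        Iterator<Map.Entry<Integer, PendingCall>> it = calls.entrySet().iterator();
        while (it.hasNext()) {
            PendingCall call = it.next().getValue();
            if (nowMillis - call.startMillis > rpcTimeoutMillis) {
                call.fail("call exceeded rpc timeout");
                it.remove();
                expired++;
            }
        }
        return expired;
    }

    int pending() { return calls.size(); }
}
```

With a sweep like this running, a caller waiting on a PendingCall should be 
woken no later than one sweep interval after the timeout, which is why a 
thread stuck for several minutes is surprising. 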

I had planned to update the title of this Jira to reflect the above finding, 
but Gaojinchao observed something interesting at his end, so I am keeping the 
title the same for now. Gaojinchao's thread is here: http://search-hadoop.com/m/teczL8KvcH

                
> Potential memory leak in client RPC timeout mechanism
> -----------------------------------------------------
>
>                 Key: HBASE-4633
>                 URL: https://issues.apache.org/jira/browse/HBASE-4633
>             Project: HBase
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 0.90.3
>         Environment: HBase version: 0.90.3 + Patches , Hadoop version: CDH3u0
>            Reporter: Shrijeet Paliwal
>
> Relevant Jiras: https://issues.apache.org/jira/browse/HBASE-2937,
> https://issues.apache.org/jira/browse/HBASE-4003
> We have been using the 'hbase.client.operation.timeout' knob
> introduced in 2937 for quite some time now. It helps us enforce SLA.
> We have two HBase clusters and two HBase client clusters. One of them
> is much busier than the other.
> We have seen deterministic behavior in clients running in the busy
> cluster. The clients' memory footprint increases consistently
> after they have been up for roughly 24 hours.
> This memory footprint almost doubles from its usual value (usual case
> == RPC timeout disabled). After much investigation nothing concrete
> came out and we had to put in a hack
> which keeps heap size in control even when RPC timeout is enabled. Also
> note, the same behavior is not observed in the 'not so busy'
> cluster.
> The patch is here : https://gist.github.com/1288023

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
