[ 
https://issues.apache.org/jira/browse/THRIFT-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174972#comment-13174972
 ] 

Aaron T. Myers commented on THRIFT-1468:
----------------------------------------

bq. I can confirm that the newest patch works. (This patch also has the 
comments you'd requested.) Ran overnight, and the latest heap-dumps don't 
indicate a build-up of WeakHashMap$Entry objects.

That's great! Thanks a lot for testing that out, Mithun. The patch and the 
comment you added look good to me.
                
> Memory leak in TSaslServerTransport
> -----------------------------------
>
>                 Key: THRIFT-1468
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1468
>             Project: Thrift
>          Issue Type: Bug
>          Components: Java - Library
>    Affects Versions: 0.5, 0.9
>            Reporter: Mithun Radhakrishnan
>              Labels: OOM, WeakHashMap, WeakReference
>         Attachments: Main.java, 
> THRIFT-1468-Memory_leak_in_TSaslServerTransport.patch, thrift-1468.patch
>
>
> I'm working on the HCatalog project. HCatalog uses a (slightly dated) version 
> of Hive that in turn depends on libthrift-0.5.0. The HCatalog-server is a 
> continuously running process that serves (meta)data over thrift. (The bug I 
> describe is related to HCATALOG-183.)
> We observed that on running the HCatalog-server with continuous 
> client-requests, the memory footprint of the server grows steadily, until we 
> see an OutOfMemoryError. I took a memory snapshot of the running 
> process, to check for leaks. I noticed that the majority of the memory (over 
> 1.3GB) was being consumed by the 
> org.apache.thrift.transport.TSaslServerTransport$Factory::transportMap. There 
> were over 52000 instances of WeakHashMap$Entry, consuming 3MB of 
> shallow-heap, and 1.3GB of retained heap.
> I suspect that entries in the WeakHashMap (transportMap) are not being 
> collected during GC, as the code expects them to be. That would only be so if 
> there are outstanding hard-references to the key in the map (TTransport).
> From the code in TSaslTransport and TSaslServerTransport, it appears that 
> there is an inadvertent cyclic reference that the runtime is unable to detect:
> 1. TSaslTransport has a (hard) back-reference to the "underlyingTransport", 
> i.e. TTransport.
> 2. TSaslServerTransport::Factory::transportMap is a WeakHashMap< TTransport, 
> TSaslServerTransport >. Here, the "underlyingTransport" is mapped back to the 
> decorating TSaslServerTransport.
> From #2, an entry can only be GCed if there's no outstanding hard-reference 
> to the TTransport. But from #1, the hard-reference comes from the value-part 
> of the hashmap entry. The runtime can't deduce that there's a cycle, 
> presumably because it's not explicit.
> (I'll be attaching a sample program to better illustrate the WeakHashMap 
> behaviour, in case I've botched the explanation above.)
> The simple solution would be to change the back-reference in #1 into a 
> WeakReference. I'll attach a patch here that might be suitable.
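
The WeakHashMap behaviour described above can be reproduced in isolation. What follows is a minimal sketch, not the attached Main.java and not Thrift code: the classes Underlying, StrongWrapper and WeakWrapper are hypothetical stand-ins for TTransport and TSaslServerTransport. It relies on the documented fact that a WeakHashMap holds its values through ordinary strong references, so a value that strongly refers to its own key keeps that key reachable.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical stand-ins for illustration only; these are not Thrift classes.
final class WeakMapCycleDemo {

    /** Plays the role of the underlying TTransport (the map key). */
    static final class Underlying {}

    /** Value with a hard back-reference to its own key (point 1 above). */
    static final class StrongWrapper {
        final Underlying underlying;
        StrongWrapper(Underlying underlying) { this.underlying = underlying; }
    }

    /** Value that refers to its key only weakly (the proposed fix). */
    static final class WeakWrapper {
        final WeakReference<Underlying> underlying;
        WeakWrapper(Underlying underlying) {
            this.underlying = new WeakReference<>(underlying);
        }
    }

    /** Asks for GC a few times; System.gc() is only a hint to the JVM. */
    static void requestGc() {
        for (int i = 0; i < 5; i++) {
            System.gc();
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    /** Map size after GC when the value strongly references the key. */
    static int sizeWithStrongBackReference() {
        Map<Underlying, StrongWrapper> map = new WeakHashMap<>();
        Underlying key = new Underlying();
        map.put(key, new StrongWrapper(key));
        key = null;   // drop the caller's reference
        requestGc();
        return map.size();   // stays 1: the value keeps the key reachable
    }

    /** Map size after GC when the value only weakly references the key. */
    static int sizeWithWeakBackReference() {
        Map<Underlying, WeakWrapper> map = new WeakHashMap<>();
        Underlying key = new Underlying();
        map.put(key, new WeakWrapper(key));
        key = null;   // the key is now only weakly reachable
        requestGc();
        return map.size();   // usually 0, but GC timing is not guaranteed
    }

    public static void main(String[] args) {
        System.out.println("strong back-reference: " + sizeWithStrongBackReference());
        System.out.println("weak back-reference:   " + sizeWithWeakBackReference());
    }
}
```

In the strong case the entry can never be cleared, whatever the collector does, which matches the steady build-up of WeakHashMap$Entry objects in the heap dump. In the weak case the entry becomes eligible for removal once the caller drops the key, although when it is actually removed depends on the JVM.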


        
