[ 
https://issues.apache.org/jira/browse/HIVE-20192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16551100#comment-16551100
 ] 

Vihang Karajgaonkar commented on HIVE-20192:
--------------------------------------------

{quote} The PersistenceManagerFactory object "pmf" is a static object which 
keeps references to the allocated PersistenceManager objects in the pmCache map. 
That's why a PersistenceManager doesn't get GC'ed and needs an explicit shutdown 
on any exception. In this case we retry instead of closing the thread, which 
overwrites the pm object and leaks the old one. {quote}

I see. Thanks for the explanation.
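
To make the mechanism in the quote concrete, here is a minimal sketch of how a factory-side cache can pin objects when a retry overwrites the reference without closing it. {{SketchFactory}} and {{SketchManager}} are hypothetical stand-ins written for this illustration; they are not the DataNucleus or Hive classes, and the real pmCache bookkeeping may differ in detail.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a static factory that tracks every manager it hands out.
class SketchFactory {
    private final Set<SketchManager> pmCache = ConcurrentHashMap.newKeySet();

    SketchManager getPersistenceManager() {
        SketchManager pm = new SketchManager(this);
        pmCache.add(pm); // the factory keeps a strong reference until close()
        return pm;
    }

    void release(SketchManager pm) {
        pmCache.remove(pm);
    }

    int cachedCount() {
        return pmCache.size();
    }
}

// Hypothetical stand-in for a manager that must be closed explicitly.
class SketchManager {
    private final SketchFactory owner;

    SketchManager(SketchFactory owner) {
        this.owner = owner;
    }

    void close() {
        owner.release(this);
    }
}

class PmLeakSketch {
    // Retry that overwrites the local reference without closing the old manager.
    static int retryWithoutClose(SketchFactory pmf, int attempts) {
        SketchManager pm = null;
        for (int i = 0; i < attempts; i++) {
            pm = pmf.getPersistenceManager(); // the previous pm stays in pmCache
        }
        return pmf.cachedCount();
    }

    // Retry that closes the old manager before replacing it.
    static int retryWithClose(SketchFactory pmf, int attempts) {
        SketchManager pm = null;
        for (int i = 0; i < attempts; i++) {
            if (pm != null) {
                pm.close(); // removes the old pm from pmCache
            }
            pm = pmf.getPersistenceManager();
        }
        return pmf.cachedCount();
    }

    public static void main(String[] args) {
        System.out.println(retryWithoutClose(new SketchFactory(), 5));
        System.out.println(retryWithClose(new SketchFactory(), 5));
    }
}
```

In this sketch the first retry loop leaves one cached entry per attempt, while the second never holds more than one, which is the difference the explicit shutdown makes.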

{quote}I think overwriting the entry by cacheThreadLocalRawStore doesn't cause 
any leak, because it overwrites with the thread local rawStore which is active 
in this thread. If the thread local rawStore is changed, it means the older one 
was already shut down gracefully before the re-create. Also, threadRawStoreMap 
shouldn't pile up as we use the same thread id. {quote}

I think you are right. The cleanup model looks optimistic: when the thread is 
reused, the {{Hive#getInternal}} method checks whether this thread-local 
rawstore can be reused, and cleans it up if the owner is different or the 
config is not compatible. So we are good in the thread re-use case, because 
the object being overwritten in {{ThreadWithGarbageCleanup.threadRawStoreMap}} 
is either replaced with the same object or was already closed. That code path 
looks good to me. This is all very tricky business, and I hope there is no 
other code path that still leaks the rawstore. 
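
As a rough illustration of why letting the map entry be overwritten matters, the sketch below models a per-thread cleanup map in two modes. {{SketchStore}}, {{ThreadStoreRegistry}} and the {{allowOverwrite}} flag are hypothetical names for this example, and using {{putIfAbsent}} to model "not allowed to be updated" is an assumption, not the actual Hive code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a store that must be shut down explicitly.
class SketchStore {
    private boolean open = true;

    void shutdown() {
        open = false;
    }

    boolean isOpen() {
        return open;
    }
}

// Hypothetical per-thread registry, loosely modeled on a thread-id -> store map.
class ThreadStoreRegistry {
    private final Map<Long, SketchStore> threadStoreMap = new ConcurrentHashMap<>();
    private final boolean allowOverwrite;

    ThreadStoreRegistry(boolean allowOverwrite) {
        this.allowOverwrite = allowOverwrite;
    }

    void cache(long threadId, SketchStore store) {
        if (allowOverwrite) {
            threadStoreMap.put(threadId, store);         // latest store wins
        } else {
            threadStoreMap.putIfAbsent(threadId, store); // assumption: first store sticks
        }
    }

    // Called when the thread exits: shut down whatever store is registered for it.
    void cleanup(long threadId) {
        SketchStore store = threadStoreMap.remove(threadId);
        if (store != null) {
            store.shutdown();
        }
    }
}

class StoreCleanupSketch {
    // Returns true if the re-created store is still open after thread cleanup.
    static boolean recreatedStoreLeaks(boolean allowOverwrite) {
        ThreadStoreRegistry registry = new ThreadStoreRegistry(allowOverwrite);
        long threadId = 1L;

        SketchStore first = new SketchStore();
        registry.cache(threadId, first);
        first.shutdown(); // the old store is shut down before re-creation

        SketchStore second = new SketchStore();
        registry.cache(threadId, second);

        registry.cleanup(threadId);
        return second.isOpen();
    }

    public static void main(String[] args) {
        System.out.println(recreatedStoreLeaks(false));
        System.out.println(recreatedStoreLeaks(true));
    }
}
```

With overwriting disabled, the cleanup only ever sees the first, already closed store, so the re-created one stays open. With overwriting enabled, the entry always points at the store that is currently active for the thread.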

This patch looks good to me. +1

> HS2 with embedded metastore is leaking JDOPersistenceManager objects.
> ---------------------------------------------------------------------
>
>                 Key: HIVE-20192
>                 URL: https://issues.apache.org/jira/browse/HIVE-20192
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.0.0, 3.1.0, 4.0.0
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>            Priority: Major
>              Labels: HiveServer2, pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-20192.01.patch
>
>
> HiveServer2 instances were crashing every 3-4 days, and HS2 was observed in an 
> unresponsive state. Full GC (FGC) was also observed happening regularly.
> The JXray report shows that pmCache (a list of JDOPersistenceManager 
> objects) occupies 84% of the heap and that there are around 16,000 references 
> to UDFClassLoader.
> {code:java}
> 10,759,230K (84.7%) Object tree for GC root(s) Java Static org.apache.hadoop.hive.metastore.ObjectStore.pmf
> - org.datanucleus.api.jdo.JDOPersistenceManagerFactory.pmCache ↘ 10,744,419K (84.6%), 1 reference(s)
>   - j.u.Collections$SetFromMap.m ↘ 10,744,419K (84.6%), 1 reference(s)
>     - {java.util.concurrent.ConcurrentHashMap}.keys ↘ 10,743,764K (84.5%), 16,872 reference(s)
>       - org.datanucleus.api.jdo.JDOPersistenceManager.ec ↘ 10,738,831K (84.5%), 16,872 reference(s)
>         ... 3 more references together retaining 4,933K (< 0.1%)
>     - java.util.concurrent.ConcurrentHashMap self 655K (< 0.1%), 1 object(s)
>       ... 2 more references together retaining 48b (< 0.1%)
> - org.datanucleus.api.jdo.JDOPersistenceManagerFactory.nucleusContext ↘ 14,810K (0.1%), 1 reference(s)
> ... 3 more references together retaining 96b (< 0.1%){code}
> When the RawStore object is re-created, it is not allowed to be updated in 
> the ThreadWithGarbageCleanup.threadRawStoreMap, so the new RawStore never 
> gets cleaned up when the thread exits.
>  



