[
https://issues.apache.org/jira/browse/HIVE-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14087857#comment-14087857
]
Vaibhav Gumashta commented on HIVE-7353:
----------------------------------------
Thanks for the review comments. I've taken a different approach now and the
problem seems more generic. Here's the core issue:
RawStore is kept as a thread-local variable. A RawStore object holds a reference
to a JDOPersistenceManager object, which JDOPersistenceManagerFactory caches.
To remove the JDOPersistenceManager from the cache, an explicit
JDOPersistenceManager#close call is required.
The issue is that HiveServer2 keeps 2 thread pools (handler, for binary
mode/http mode, and async), each managed by an ExecutorService. Based on the
config, the thread pools keep a certain number of threads alive and kill excess
threads after a configurable keepAliveTime expires. However, ExecutorService
does not provide a hook to plug in custom cleanup code when a thread is killed;
ideally this is where we'd plug in code to close the JDOPersistenceManager
stored in the thread-local RawStore.
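To make the failure mode concrete, here is a minimal, JDK-only sketch of it.
It does not use Hive or DataNucleus classes: CACHE is a hypothetical stand-in
for the factory-side cache, STORE for the thread-local RawStore, and the pool
sizes and 50 ms keepAliveTime are arbitrary values chosen for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ThreadLocalLeakDemo {
    // Hypothetical stand-in for the cache of open managers kept by the factory.
    static final Set<Object> CACHE = ConcurrentHashMap.newKeySet();

    // Hypothetical stand-in for the thread-local RawStore: the first access on a
    // thread creates a "manager" and registers it in the shared cache.
    static final ThreadLocal<Object> STORE = ThreadLocal.withInitial(() -> {
        Object manager = new Object();
        CACHE.add(manager);
        return manager;
    });

    /** Runs one task per worker thread, waits for the pool to retire them, returns the cache size. */
    static int cachedAfterThreadsExpire(int tasks) {
        CACHE.clear();
        // No core threads, so every worker is an "excess" thread subject to keepAliveTime.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, tasks, 50, TimeUnit.MILLISECONDS, new SynchronousQueue<>());
        CountDownLatch started = new CountDownLatch(tasks);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    STORE.get();          // touch the thread-local on this worker
                    started.countDown();
                    try {
                        started.await();  // hold the worker so each task gets its own thread
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            started.await();
            // Wait (bounded) until the pool has retired all idle workers.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (pool.getPoolSize() > 0 && System.nanoTime() < deadline) {
                Thread.sleep(20);
            }
            // Nothing ran on thread exit, so the cache entries are still there.
            return CACHE.size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("entries still cached: " + cachedAfterThreadsExpire(4));
    }
}
```

The worker threads are gone by the time the method returns, but the entries
they registered remain in the shared cache, because nothing was invoked when
the pool retired them.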
The current approach I've taken provides a custom ThreadFactory when creating
the thread pool. The threads it creates have a finalize method that does the
cleanup: the ThreadFactory maintains a map from each Thread to its RawStore
object, and the finalize method of each thread retrieves its RawStore from the
map and performs the shutdown.
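As a rough illustration of the idea (not the code in the attached patches),
here is a JDK-only sketch of a ThreadFactory that keeps a per-thread map and
cleans up when a worker goes away. Note one deliberate substitution: instead of
finalize, whose timing depends on the garbage collector, the sketch wraps the
pool's worker Runnable in try/finally so the cleanup runs when the worker's run
loop returns. Store, CleanupThreadFactory and storeForCurrentThread are
hypothetical names, and Store only counts open instances rather than holding a
real manager.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CleanupThreadFactoryDemo {
    /** Hypothetical stand-in for RawStore; shutdown() plays the role of closing the manager. */
    static final class Store {
        static final AtomicInteger CREATED = new AtomicInteger();
        static final AtomicInteger OPEN = new AtomicInteger();
        Store() { CREATED.incrementAndGet(); OPEN.incrementAndGet(); }
        void shutdown() { OPEN.decrementAndGet(); }
    }

    /** Creates worker threads that shut down their registered Store when they exit. */
    static final class CleanupThreadFactory implements ThreadFactory {
        private final Map<Thread, Store> stores = new ConcurrentHashMap<>();

        /** Called by a task on its worker thread to get (or lazily create) that thread's Store. */
        Store storeForCurrentThread() {
            return stores.computeIfAbsent(Thread.currentThread(), t -> new Store());
        }

        @Override
        public Thread newThread(Runnable workerLoop) {
            return new Thread(() -> {
                try {
                    workerLoop.run();  // returns once the pool retires this worker
                } finally {
                    Store store = stores.remove(Thread.currentThread());
                    if (store != null) {
                        store.shutdown();
                    }
                }
            });
        }
    }

    /** Returns {stores created, stores still open} after the pool has retired its workers. */
    static int[] run(int tasks) {
        Store.CREATED.set(0);
        Store.OPEN.set(0);
        CleanupThreadFactory factory = new CleanupThreadFactory();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, tasks, 50, TimeUnit.MILLISECONDS, new SynchronousQueue<>(), factory);
        CountDownLatch started = new CountDownLatch(tasks);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    factory.storeForCurrentThread();
                    started.countDown();
                    try {
                        started.await();  // hold the worker so each task gets its own thread
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            started.await();
            // The finally block runs slightly after the pool drops the worker, so wait on both.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while ((pool.getPoolSize() > 0 || Store.OPEN.get() > 0)
                    && System.nanoTime() < deadline) {
                Thread.sleep(20);
            }
            return new int[] {Store.CREATED.get(), Store.OPEN.get()};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] result = run(4);
        System.out.println("created=" + result[0] + ", still open=" + result[1]);
    }
}
```

The trade-off between the two hooks: try/finally runs deterministically on the
exiting thread, whereas finalize only runs if and when the Thread object is
collected.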
On another note, the remote metastore also uses an ExecutorService to maintain
its thread pool. I haven't tested there, but a similar problem should exist in
that case.
> HiveServer2 using embedded MetaStore leaks JDOPersistanceManager
> ----------------------------------------------------------------
>
> Key: HIVE-7353
> URL: https://issues.apache.org/jira/browse/HIVE-7353
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Affects Versions: 0.13.0
> Reporter: Vaibhav Gumashta
> Assignee: Vaibhav Gumashta
> Fix For: 0.14.0
>
> Attachments: HIVE-7353.1.patch, HIVE-7353.2.patch
>
>
> When using the embedded metastore, HiveServer2 creates new instances of
> JDOPersistenceManager for the background threads that run async operations,
> rather than using the one from the foreground (handler) thread. Since
> JDOPersistenceManagerFactory caches JDOPersistenceManager instances, they
> are never GCed.