Mithun Radhakrishnan created HIVE-3098:
------------------------------------------
Summary: Memory leak from large number of FileSystem instances in
FileSystem.CACHE. (Must cache UGIs.)
Key: HIVE-3098
URL: https://issues.apache.org/jira/browse/HIVE-3098
Project: Hive
Issue Type: Bug
Components: Shims
Affects Versions: 0.9.0
Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security
turned on.
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing
the Oracle backend).
The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60-threads,
in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had
1000000 instances of FileSystem, whose combined retained-mem consumed the
entire heap.
It boiled down to hadoop::UserGroupInformation::equals() being implemented such
that the "Subject" member is compared for equality ("=="), and not equivalence
(".equals()"). This causes equivalent UGI instances to compare as unequal, and
causes a new FileSystem instance to be created and cached.
The UGI.equals() is so implemented, incidentally, as a fix for yet another
problem (HADOOP-6670); so it is unlikely that that implementation can be
modified.
The solution for this is to check for UGI equivalence in HCatalog (i.e. in the
Hive metastore), using an cache for UGI instances in the shims.
I have a patch to fix this. I'll upload it shortly. I just ran an overnight
test to confirm that the memory-leak has been arrested.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira