[
https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402716#comment-13402716
]
Daryn Sharp commented on HIVE-3098:
-----------------------------------
Unfortunately, no. Making a {{UGI}} immutable really means marking the
{{Subject}} immutable. Everything token-based breaks because tokens can't be
added to the {{Subject}}; you can't log in from a keytab or relogin because you
can't update the TGT in the {{Subject}}; and proxy users (and maybe SPNEGO?)
won't work because service tickets can't be added to the {{Subject}}.
A {{Subject}}/{{UGI}} is an execution context with specific privileges. Those
contexts cannot be cached and shared w/o risking escalated privileges. Think of
it this way: If I entrust you with the keys to my home to pick up a delivery, I
don't want you to make a copy of the keys and have the ability to enter anytime
you want w/o my explicit permission.
Without knowing the intricacies, I recommend leaving the fs cache on, creating
a new {{UGI}} for connections, and calling {{closeAllForUGI}} when the request
is complete.
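
To make the recommendation concrete, here is a minimal, self-contained sketch of
the pattern. It is a toy model, not Hadoop code: {{Ugi}}, {{Key}}, {{CACHE}},
{{get}} and {{closeAllForUgi}} are hypothetical stand-ins that only imitate the
behaviour described in this issue (a cache keyed on a UGI whose equality is
identity-based). It shows why one new UGI per request grows the cache without
bound, and how closing everything for that UGI at the end of the request keeps
it flat.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class UgiCacheSketch {
    /** Toy stand-in for a UGI (hypothetical). It does not override equals(),
     *  so two instances for the same user never compare equal. */
    static final class Ugi {
        final String user;
        Ugi(String user) { this.user = user; }
    }

    /** Toy cache key (hypothetical): scheme plus the UGI, compared by identity. */
    static final class Key {
        final String scheme;
        final Ugi ugi;
        Key(String scheme, Ugi ugi) { this.scheme = scheme; this.ugi = ugi; }
        @Override public boolean equals(Object o) {
            return o instanceof Key
                && ((Key) o).scheme.equals(scheme)
                && ((Key) o).ugi == ugi;
        }
        @Override public int hashCode() {
            return Objects.hash(scheme, System.identityHashCode(ugi));
        }
    }

    /** Toy stand-in for the file system cache; values are placeholders. */
    static final Map<Key, Object> CACHE = new HashMap<>();

    /** Returns the cached instance for (scheme, ugi), creating it if absent. */
    static Object get(String scheme, Ugi ugi) {
        return CACHE.computeIfAbsent(new Key(scheme, ugi), k -> new Object());
    }

    /** Toy counterpart of closeAllForUGI: evicts every entry tied to this UGI. */
    static void closeAllForUgi(Ugi ugi) {
        CACHE.keySet().removeIf(k -> k.ugi == ugi);
    }

    /** New UGI per request, never cleaned up: one leaked entry per request. */
    static int leakyRequests(int n) {
        CACHE.clear();
        for (int i = 0; i < n; i++) {
            get("hdfs", new Ugi("alice"));
        }
        return CACHE.size();
    }

    /** New UGI per request, cache left on, entries closed when the request ends. */
    static int cleanedRequests(int n) {
        CACHE.clear();
        for (int i = 0; i < n; i++) {
            Ugi ugi = new Ugi("alice");
            try {
                get("hdfs", ugi);
                get("hdfs", ugi); // same UGI within the request: cache hit
            } finally {
                closeAllForUgi(ugi);
            }
        }
        return CACHE.size();
    }

    public static void main(String[] args) {
        System.out.println("without cleanup: " + leakyRequests(1000));   // 1000
        System.out.println("with cleanup:    " + cleanedRequests(1000)); // 0
    }
}
```

Within a single request the cache still does its job, since repeated lookups
with the same UGI return the same instance. Nothing is shared across requests,
so no execution context outlives the request that owns it.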
> Memory leak from large number of FileSystem instances in FileSystem.CACHE.
> (Must cache UGIs.)
> ---------------------------------------------------------------------------------------------
>
> Key: HIVE-3098
> URL: https://issues.apache.org/jira/browse/HIVE-3098
> Project: Hive
> Issue Type: Bug
> Components: Shims
> Affects Versions: 0.9.0
> Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security
> turned on.
> Reporter: Mithun Radhakrishnan
> Assignee: Mithun Radhakrishnan
> Attachments: HIVE-3098.patch
>
>
> The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing
> the Oracle backend).
> The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60 threads,
> in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had
> 1000000 instances of FileSystem, whose combined retained-mem consumed the
> entire heap.
> It boiled down to hadoop::UserGroupInformation::equals() being implemented
> such that the "Subject" member is compared for equality ("=="), and not
> equivalence (".equals()"). This causes equivalent UGI instances to compare as
> unequal, and causes a new FileSystem instance to be created and cached.
> The UGI.equals() is so implemented, incidentally, as a fix for yet another
> problem (HADOOP-6670); so it is unlikely that that implementation can be
> modified.
> The solution for this is to check for UGI equivalence in HCatalog (i.e. in
> the Hive metastore), using a cache for UGI instances in the shims.
> I have a patch to fix this. I'll upload it shortly. I just ran an overnight
> test to confirm that the memory-leak has been arrested.