[ 
https://issues.apache.org/jira/browse/HADOOP-13442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HADOOP-13442:
---------------------------------
    Attachment: HADOOP-13442.patch

# Changed the group provider to cache the de-duped list instead of the raw list.
# Added a new {{UGI#getGroups}} that returns the aforementioned de-duped list.
# Changed {{UGI#getPrimaryGroup}} to call {{UGI#getGroups}} to avoid an array 
copy.
# Removed unnecessary synchronization of the {{UGI#getGroups}} method.  This 
required a minor tweak to {{Groups#getGroups}} to make it thread-safe.  It is 
already called elsewhere without synchronization, so this just makes that safe.  
Reduces contention on cached token->ugi instances.
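
For illustration only, a minimal sketch of the idea behind the changes above: de-dup once, cache the result, and read the primary group from the cached list.  This is not the code in the patch; {{GroupCacheSketch}} and its {{provider}} function are hypothetical stand-ins for {{Groups}} and the group mapping provider.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch; not the actual Hadoop Groups/UGI implementation.
class GroupCacheSketch {
  // Stand-in for the group provider: user -> raw group list (may contain duplicates).
  private final Function<String, List<String>> provider;
  // Concurrent map, so lookups need no external synchronization.
  private final Map<String, List<String>> cache = new ConcurrentHashMap<>();

  GroupCacheSketch(Function<String, List<String>> provider) {
    this.provider = provider;
  }

  // De-dup once (preserving order) and cache an immutable list.
  List<String> getGroups(String user) {
    return cache.computeIfAbsent(user, u ->
        Collections.unmodifiableList(
            new ArrayList<>(new LinkedHashSet<>(provider.apply(u)))));
  }

  // Primary group is the first cached entry; no array copy is made.
  String getPrimaryGroup(String user) {
    List<String> groups = getGroups(user);
    if (groups.isEmpty()) {
      throw new IllegalStateException("There is no primary group for " + user);
    }
    return groups.get(0);
  }

  public static void main(String[] args) {
    GroupCacheSketch g =
        new GroupCacheSketch(u -> Arrays.asList("staff", "hdfs", "staff"));
    System.out.println(g.getGroups("alice"));       // [staff, hdfs]
    System.out.println(g.getPrimaryGroup("alice")); // staff
  }
}
```

Because the cached list is immutable and the map is concurrent, callers can share the same list instance without locking, which is the property the last item relies on.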

> Optimize UGI group lookups
> --------------------------
>
>                 Key: HADOOP-13442
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13442
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HADOOP-13442.patch
>
>
> {{UGI#getGroups}} and its callers are inefficient.  The list is unnecessarily 
> converted to multiple collections.
> For _every_ invocation, the {{List<String>}} from the group provider is 
> converted into a {{LinkedHashSet<String>}} (to de-dup), and then back to a 
> {{String[]}}.  Callers testing for group membership then convert it back to a 
> {{List<String>}}.  This should be done once to reduce allocations.
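
The per-call conversion chain described in the issue can be sketched roughly as follows.  The class and method names are hypothetical and only mirror the shape of the problem; they are not copied from the Hadoop source.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the allocation pattern, not the actual UGI code.
class ConversionChainSketch {
  // On every call: List -> LinkedHashSet (de-dup) -> String[].
  static String[] getGroupNames(List<String> fromProvider) {
    Set<String> deduped = new LinkedHashSet<>(fromProvider);
    return deduped.toArray(new String[0]);
  }

  // Caller side: String[] -> List again, just to test membership.
  static boolean isMember(List<String> fromProvider, String group) {
    return Arrays.asList(getGroupNames(fromProvider)).contains(group);
  }
}
```

Each membership test in this sketch allocates a set, an array, and a list wrapper, even though the underlying group list has not changed.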



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

