[ https://issues.apache.org/jira/browse/HADOOP-13442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daryn Sharp updated HADOOP-13442:
---------------------------------
    Attachment: HADOOP-13442.patch

# Changed group provider to cache the de-duped list instead of the raw list.
# Added new {{UGI#getGroups}} that returns the aforementioned de-duped list.
# Changed {{UGI#getPrimaryGroup}} to call {{UGI#getGroups}} to avoid an array copy.
# Removed unnecessary synchronization of {{UGI#getGroups}} method. Required minor tweak to {{Groups#getGroups}} to be thread-safe. Already used elsewhere w/o synch, so this just makes it safe. Reduces contention with cached token->ugi instances.

> Optimize UGI group lookups
> --------------------------
>
>                 Key: HADOOP-13442
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13442
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HADOOP-13442.patch
>
>
> {{UGI#getGroups}} and its usage are inefficient. The list is unnecessarily
> converted to multiple collections.
> For _every_ invocation, the {{List<String>}} from the group provider is
> converted into a {{LinkedHashSet<String>}} (to de-dup), then back to a
> {{String[]}}. Then callers testing for group membership convert back to a
> {{List<String>}}. This should be done once to reduce allocations.
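
The "de-dup once, then reuse" idea from the description can be sketched roughly as below. This is an illustration only, not the contents of HADOOP-13442.patch: the class {{DedupedGroupCache}}, its constructor, and the {{Function}}-based provider are hypothetical stand-ins for the real group provider and UGI code, and returning {{null}} for a user with no groups is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch: cache the de-duplicated group list per user. */
final class DedupedGroupCache {
  // user -> immutable, de-duplicated list that keeps the provider's order
  private final Map<String, List<String>> cache = new ConcurrentHashMap<>();

  // Stand-in for the real group provider; it may return duplicates.
  private final Function<String, List<String>> provider;

  DedupedGroupCache(Function<String, List<String>> provider) {
    this.provider = provider;
  }

  /**
   * Returns the cached de-duplicated list. The LinkedHashSet round trip
   * happens only on a cache miss, not on every call. No synchronized block
   * is needed here because ConcurrentHashMap handles concurrent access and
   * the cached value is immutable.
   */
  List<String> getGroups(String user) {
    return cache.computeIfAbsent(user, u ->
        Collections.unmodifiableList(
            new ArrayList<>(new LinkedHashSet<>(provider.apply(u)))));
  }

  /** First group in the list, or null if there is none (sketch assumption). */
  String getPrimaryGroup(String user) {
    List<String> groups = getGroups(user);
    return groups.isEmpty() ? null : groups.get(0);
  }

  /** Membership test against the cached list, with no array or list copy. */
  boolean isMember(String user, String group) {
    return getGroups(user).contains(group);
  }
}
```

In this sketch, callers that test membership or ask for the primary group share one immutable list per user, so repeated calls allocate nothing after the first lookup. Cache expiry and refresh are deliberately left out.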